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CHAPTER  1: 


INTRODUCTION 


Automated  design  tools  and  integrated  design  flow  methodologies  were  developed  that 
demonstrated  more  than  an  order-of-magnitude  reduction  in  cycle  time  and  cost  for 
mixed  signal  (digital/analog/RF)  and  mixed  electronic-photonic  microsystems,  such  as 
high-clock-rate  and  wide-dynamic-range  Analog-to-Digital  (A/D)  converters.  These  tools 
and  methodologies  combined  several  innovative  synthesis,  simulation  and  optimization 
techniques.  These  techniques  include 

•  A  fast  time  domain  circuit  simulation  method 

•  Model  Order  Reduction  (MOR)  tools 

•  System-level  mixed-signal  circuit  synthesis  and  optimization  tools 

•  Parasitic  extraction  tools 

A  unique  combination  of  these  innovative  techniques  enabled  efficient  design  of  systems 
far  beyond  today’s  state-of-the-art.  These  tools  in  the  future  will  allow  designers  of  high 
performance  mixed  signal  circuits,  such  as  high-order  tunable  delta-sigma  modulators 
and  the  photonic  A/D  converters  operating  at  lO’s  of  GHz  clock  rates,  to  optimize  their 
designs  with  orders  of  magnitude  more  iterations  and  hence  far  better  performance. 

1. 1  Improved  Mixed-Signal  IC  Design  Flow 

Conventional  mixed-signal  circuit  design  proceeds  at  three  levels  as  shown  in  Fig.l.  The 
top  level  (i.e..  Level  1)  is  the  architectural  level  where  the  behavioral  simulation,  circuit 
synthesis  and  system  level  optimization  are  performed.  The  middle  level  (i.e..  Level  2) 
contains  the  detailed  equivalent  circuits  for  all  of  the  building  blocks  found  in  the 
architectural  level.  The  bottom  level  (i.e..  Level  3)  contains  the  layouts  of  the  circuit 
blocks.  At  present,  the  circuit  designer  works  back  and  forth  between  Level  2  and  Level 
1  to  produce  a  reduced  order  model  of  the  circuits.  Usually,  this  effort  requires 
considerable  intuition  on  the  part  of  the  circuit  designer.  Unfortunately,  a  wide  range  of 
parasitic  processes  are  also  produced  as  part  of  laying  out  the  circuit  in  Level  3.  These 
processes  must  be  included  in  the  Level  2  circuit  models  and  incorporated  back  into  the 
Level  1  synthesis  and  optimization. 

With  the  new  HRL  CAD  tool  framework  the  ad  hoc  methodology  described  above  was 
replaced  with  a  consistent  and  seamless  design  flow.  First,  the  link  between  Level  3  and 
Level  2  and  that  between  Level  2  and  Level  1  was  performed  automatically  with  reduced 
order  models.  Secondly,  an  additional  reduced  order  model  linked  Level  3  directly  to 
Level  1.  Thirdly,  the  parasitics  in  Level  3  were  computed  with  a  fast  method.  These 
steps  allowed  synthesis  to  occur  directly  at  Level  1,  including  all  the  parasitics. 
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A  fast  time-domain  simulation  using  a  Jacobian-free  algorithm  was  used  to  support  both 
the  behavioral  and  circuit  level  (Level  1&2)  simulations.  This  new  and  innovative  circuit 
simulation  approach  is  applicable  for  both  architectural  and  transistor-level  time-domain 
circuit  simulations.  The  heart  of  this  method  is  to  partition  the  circuits  into  linear  and 
non-linear  subblocks  and  lump  the  nonlinearities  together  in  a  form  such  that  the  need  of 
calculating  computationally  expensive  Jacobians  is  totally  eliminated.  Some  of  the  salient 
features  on  this  method  are  as  follows: 


•  Works  in  the  time  domain,  so  the  circuit  nonlinearities  can  be  easily  included 
and  treated 

•  Provides  a  uniform  error  distribution  for  the  solution 

•  Is  compatible  with  reduced  order  models 

•  Requires  much  less  simulation  time  than  SPICE  calculations  with  the  same 
accuracy 


1 .1 .2  Model  Order  Reduction 


Model  order  reduction  (MOR)  methods,  such  as  asymptotic  waveform  evaluation,  Pade- 
via-Lanczos  and  Amoldi  algorithms,  provide  macro-model  representations  of  system  and 
subsystem  level  circuit  blocks  in  a  form  appropriate  for  network-oriented  SPICE  and 
behavioral  system-level  circuit  simulations.  The  resultant  macro-models  can  capture  all 
the  necessary  characteristics  of  the  original  multi-port  active  and  passive  circuits  with 
many  fewer  degrees  of  freedom  by  eliminating  the  internal  nodes  in  the  circuit.  This 
traditional  MOR  method  was  extended  to  a  more  general  Chebyshev-  and  wavelet- 
expansion-based  MOR  method,  which  is  now  part  of  Broadband  efficient  macromodeling 
program  (BEMP). 

BEMP  provides  a  means  for  generating  transfer  functions  of  multi-port  circuits.  The 
transfer  functions  can  be  used  to  connect  between  levels  in  a  design  hierarchy  and  also  to 
enable  reduction  in  CPU  time.  Multi-variable  optimization  and  automated  synthesis 
methods  can  be  developed  around  these  transfer  functions 

1.1.3  Parasitic  Extraction 

We  developed  a  fast,  high-order,  frequency-domain,  full-wave  electromagnetic 
simulation  algorithm  for  effective  parasitic  extraction  of  mixed  signal  circuits  directly 
from  the  layout.  The  method  is  based  upon  a  high-order,  method-of-moment  solution 
scheme  for  the  Mixed  Potential  Integral  Equation  (MPIE)  and  a  fast  iterative  algorithm 
based  on  the  high-order  non-uniform  FFT  method.  This  innovative  approach  uniquely 
combines  a  quadrature-point-based  discretization  scheme  for  the  MPIE  and  the  high- 
order  pre-corrected  FFT  method  so  that  the  algorithm  scales  as  O(NlogN).  This  results  in 
reduced  Central  Processing  Unit  (CPU)  time  and  memory  requirements  compared  to 
other  frequency-domain  full  wave  electromagnetic  simulation  techniques,  such  as  the 
method  of  moments,  the  finite  element  method,  and  the  method  of  lines.  The  initial 
parasitic  extraction  tool  was  able  to  analyze  dense  and  complex  passive  printed  circuits 
on  a  one-layer  conductor-backed  planar  substrate.  The  outputs  of  the  parasitic  extraction 
tool  were  used  to  calculate  wideband  frequency-domain  scattering  parameters  of  the 
circuit,  which  are  the  input  parameters  to  the  MOR  tools. 

1.2  Overall  Program  Milestones 

The  goal  for  milestones  are  as  listed  for  the  Baseline  Program  for  Baseline  Program 
Schedule. 
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MILESTONE 

Comments 

Fast  solution  for  circuit  simulations 

Yes 

HAARSPICE-ELD  algorithm 
shown  to  be  lOX  better  than 
SPICE 

Circuit  equations  in  terms  of  wavelet 
bases 

Yes 

Developed  connection 
coefficient  operator  matrices 
for  all  Daubechies  class  of 
wavelets 

Adaptive  wavelet  expansion  functions 

Yes 

Developed  but  not 
implemented  in  HAARSPICE 

Circuit  equations  in  terms  of  wavelet 
bases 

Yes 

Developed  but  not 
implemented  in  HAARSPICE 

Automated  prototype  for  wavelet-based 
time  domain  algorithm 

Yes 

Developed  first  time-domain 
DAE  solver  using  Haar 
wavelets 

MOR  methods  and  algorithms 

Yes 

Developed  and  implemented  in 
BEMP  1.0  release 

Special  MOR  in  time  domain 

Yes 

Developed  and  implemented  in 
BEMP  3.0  released 

MOR  for  active  components 

Yes 

Developed  Verilog-AMS 
models  of  4‘'’-order,  2"‘^-order 

CT  delata-sigma  modulators 

Automated  prototype  for  MOR 
algorithms 

Model  library  development 

Yes 

Developed  for  HAARSPICE 

CADENCE-compatible 

NEOCADLib 

Automated  prototype  for  circuit  and 
optimization  algorithms 

No 

Currently  HAARSPICE  does 
not  support  optimization. 

Fast  time-domain  circuit  simulator 
validation 

Yes 

Demonstrated  at  Neocad  Final 
Review  Meeting  (San  Diego) 
HAARSPICE  2.0  and  User 
Manual  Released 

Circuit  synthesis  and  optimization  tool 
validation 

No 

Currently  HAARSPICE  does 
not  support  synthesis. 

Fast  high-order,  frequency-domain,  full- 
wave  EM  simulation  algorithm 

Yes 

QS-PCFFT  algorithm 
developed 

Automated  prototype  of  EM  simulation 
algorithm 

Yes 

QS-PCFFT  validated  and 
benchmarked 

Final  report 

Yes 

This  Document 

Table  1.  NeoCAD  Milestones 
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CHAPTER  2  FAST  TIME  DOMAIN  MIXED-SIGNAL  CIRCUIT 

SIMULATION 


2.1  HAARSPICE  Algorithms 

We  begin  this  ehapter  by  describing  the  details  of  the  algorithms  used  in  HAARSPICE. 
Section  2.1  describes  the  mathematics  behind  circuit  simulators  and  formulates  the  solver 
method  used  by  the  algorithms.  Section  2.2  of  this  chapter  then  describes  the  software 
architecture  of  HAARSPICE  and  its  implementation  using  the  Object  Oriented 
programming  (OOP)  methods.  Finally,  in  section  2.3  we  present  benchmark  results  and 
list  the  performance  metrics. 

2.1.1  Mathematical  Background 

Circuits  are  represented  as  differential  algebraic  equations  (DAEs)  using  modified  nodal 
analysis  technique  (described  in  sec.  2.2.2).  In  the  matrix  form  the  DAEs  can  be  as 
follows: 

Cx(t)  +  Gx  (t)  =  F  (xit))  +  u(t)  (1) 

where  C  =  is  the  capacitance  matrix  used  for  representing  magnetic  and  electric 
storage  devices  (i.e.  inductors  and  capacitors),  G=  the  transconductance  matrix 

representing  the  lossy  elements  (i.e.  resistors  and  transformers)  in  the  circuit,  u(t)  is  the 
known  input  at  a  given  node  and  F  =  is  the  non-linear  operator  matrix  of  M 
unknowns  (i.e.  x(t)).  Integration  of  Eq.  (1)  gives, 

i  t  t 

C[x(0- a:(-oo)]  +  <7  jrfCA:(C)=  \dt'F  {x{t'))  +  \dt'u{t')  (2) 


The  unknowns,  x(t),  and  the  input,  u(t),  can  be  expanded  in  W  known  functions,  ^,.(0  , 

«  ( O  =  Z  ( O  «  ,  ^  ( O  =  Z  i 

i  =  l  i  = 1 

where  ^  0 ,  as  ^  oo  .  Therefore  for  time  steps 

=  fdlW) 

and  the  integral  form  can  be  written  in  terms  of  ic;,^,.(0  as 

C^j,x,  +  Gi:j,x,=  ]dt'F(t',x{t'))  +  i:j,u,  (3) 

—  00 

Approximate  F(.x:(0)  =  approximate  the  coefficients  by  sampling. 

Thus,  fi  =  <I>y.F(tj,OjiXi) .  The  Eq.  3  can  now  be  solved  in  discrete  steps. 
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{CO) ,,  +  GE ,,  }x,  =  Yj,{0  -}F(tj ,  (D  )  +  «, }  (4) 


The  form  described  in  Eq.  4  can  be  used  to  iteratively  solve  the  linear  equations  for  Jc,. . 

Iterative  solvers  are  appealing  because  they  automatically  use  the  sparseness  of  C  and  G. 
Though  G  is  non-singular,  it  can  be  extremely  ill-conditioned  making  iterative  methods 
ineffective. 

A  non-iterative  linear  solver  is  obtained  by  multiplying  Eq.  (4)  by  G”'  and 
setting  =  G~'C 

One  can  express  A  in  the  eigen-basis,  A  -  SXS  ~' ,  with  A  the  diagonal  matrix  of  the 
eigenvalues  of  A  and  S  the  eigenvectors.  With  y.  =  S'^x.andB  =  , 


For  non-symmetric  A,  X  and  S  are  complex. 

2.1.2  Wavelet  Expansions 

The  essential  property  of  wavelets  used  to  develop  fast  methods  is  the  compact  support  of 
the  basis,  (/) .  Compact  support  makes  the  matrix  sparse  with  non-zero  elements 

close  to  the  diagonal.  The  integral  operator  becomes  close  to  an  upper  triangular 

matrix.  The  specifics  of  the  wavelets  enable  one  to  make  definite  statements  about 

andE^.. .  A  particular  well-known  property  is  that  both  and  E^,.  can  be  made  Toeplitz 

with  zero  padding,  enabling  a  fast  FFT  convolution  operation  of  order  0(NlogN)  to 
evaluate  their  inverses.  We  begin  with  using  the  most  basic  type  of  wavelets  called  the 
Haar  Basis  wavelets  in  Eq.  6. 


In  the  Haar  Wavelets  sampled  at  tj  =  (y  -  j)N ,  the  measure  of  support  is  A  =  l/  N . 
Therefore  Oy  =  <5^,. /A and  E^.,.  =0(y-i)  where 


0(«) 


1  «  >  0 
■1/2  n  =  0 
0  «  <  0 


In  the  Haar  basis  Eq.  6  becomes 
A 
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The  operation  count  needed  to  solve  this  equation  can  be  estimated.  Inversion  of 
operators  of  the  form  a^Sy  +  ©(j  -  i)  is  an  0(N )  operation.  Inversion  for  all  is 

0(MN).  Application  of  S  to  all  y.  is  O(M^N)  .  Evaluation  of  Fis  also  O(M^N)  though  in 
general,  F  is  very  sparse  in  M.  The  application  of  B  and  the  map  y^  — >  Jc.is  O(M^N). 

Thus,  the  method  scales  approximately  as  O(M^N)  ,  which  is  the  same  scaling  as  implicit 
time-domain  methods  when  a  dense  inverse  of  the  system  matrix  is  used. 

2.1.3  Algorithm  Steps 

For  each  unknown  {x(0}m  a  unique  set  of  wavelets  can  be  chosen.  Within  the  Haar 

wavelet  space,  the  number  of  functions  can  be  chosen  dependent  on  m,Nm-  If  the  circuit  is 
characterized  by  a  few  high-frequency  signals  and  many  low  frequency  signals,  selection 
of  a  wavelet  basis  specific  to  signal  bandwidth  leads  to  a  considerable  reduction  in 
simulation  time.  Based  on  this  principal  we  developed  an  algorithm  that  is  1 0x  faster 
than  the  commercial  SPICE  solver  for  time-domain  simulations  of  mixed-signal  circuits. 

Algorithm  :  Eigenspace  Linear  Domain  Solver  (ELD) 

Step  1 :  Partition  circuit  into  Analog  and  Digital  sub-blocks 

Assumption  1:  All  Digital  Blocks  are  synchronous  type. 

Assumption  2:  Time  invariance  between  clock  edges. 

Step  2:  Find  DC  operating  point  using  homotopy  methods 

Homotopy  methods  used:  arc  length  and  Gmin  stepping 

Step  3:  Solve  Differential  Algebraic  Equation  (i.e.  Eq.  6) 

Y(t)  =  exp(h/A)*Y(t-h) 

A  -  Eigenvalues 

h  -  Variable  time  step  based  on  Error  analogous  to  wavelet  resolution 

Note  :  Matrix- Vector  Computations  done  on  multiple  Linux  Clusters 
gives  7-1  Ox  Speed  up  in  comparison  with  conventional  time  stepping  methods 

The  benchmark  results  of  the  above  algorithm  can  be  found  in  Section  2.3  of  this  chapter. 


Other  algorithms  developed  include; 

ESE  -  Eigenspace  Evolution  Non  Linear  Domain  Solver  (Jacobian  Free) 

ENL  -  Eigenspace  Non  Linear  Domain  Solver 

Details  of  both  of  the  above  mentioned  algorithms  are  found  in  the  HAARSPICE  Manual 
reprinted  in  Appendix  A  of  this  report. 


10 


2.2 


Implementation  in  HAARSPICE 


We  have  used  an  Object  Oriented  Programming  (OOP)  approach  in  the  development  of 
HAARSPICE.  OOP  allows  software  developers  to  come  up  with  classes  that  have  data 
and  functions  encapsulated  into  self-contained  entities.  Besides  data  and  function 
encapsulation,  OOP  provides  other  features,  such  as  function  overloading,  inheritance 
and  function  hiding,  making  OOP  a  desirable  tool  in  the  development  of  circuit 
simulators.  In  this  chapter  we  describe  the  HAARSPICE  Programming  Model,  Matrix 
formulation  and  Command  structure.  For  more  details  on  HAARSPICE  syntax  and 
operation,  the  reader  can  refer  to  HAARSPICE  User’s  Manual  in  Appendix  A  of  this 
report. 

2.2.1  HAARSPICE  Programming  Model 


Fig.  2  shows  the  general  framework  of  our  simulator.  In  HAARSPICE  a  circuit  is 
described  using  a  text  file  containing  a  list  of  the  required  electronic  components.  The 
circuit  netlist  is  obtained  using  a  set  of  standalone  commands  in  which  each  command 
performs  a  particular  type  of  analysis.  HAARSPICE  commands  will  be  explained  later  in 
this  chapter. 

Every  line  in  the  netlist  is  read  as  a  “StreamElement”.  HAARSPICE  parses  the  stream 
element  into  different  classes,  such  as  the  “Component”  class,  the  “Model”  class  and 
the  “Subcircuit”  class. 


I  CurrentSrc 


VoltageSrc 


CCCS 

CCVS 

VCVS 

VCCS 


Ia  "  StreamElement  . 


Subcircuit 


Fig.  2:  HAARSPICE  Programming  Model 
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In  the  component  class  we  have  the  non-linear  and  linear  components  used  in  our 
circuits.  Fig.  2  shows  all  the  elements  in  the  component  class.  The  OOP  method  used  in 
HAARSPICE  makes  adding  new  classes  or  components  relatively  easy  compared  to 
SPICE3.  Adding  a  new  model  in  SPICE3  requires  modifying  over  30  C-files  while  in 
HAARSPICE  adding  a  new  model  requires  making  only  a  few  changes  to  the  component 
class. 

Below  is  the  structure  of  the  Diode  Class  used  in  HAARSPICE. 

class  Diode  :  public  Component  { 
public: 

DiodeO  :  _lambda(1)  {} 

Diode(const  StreamElement&  def,  class  StringTolndex&  stoi); 
virtual  void  F(RArray&,  double,  const  RArray&)  const; 
virtual  void  JacobianF(RMatrix&,  double,  const  RArray&)  const; 
virtual  Component*  create(const  StreamElement&  def,  class 
StringTolndex&  stoi)  const; 

virtual  const  char**  cnames()  const  { return  _cnames; } 

//  returns  true  if  limited  to  devices  conductance, 
virtual  bool  setLeakConductance(double  leakG_); 
virtual  bool  lambda(double  Jambda_)  { return  (_lambda  =  min(1.0, 
_lambda_))  ==  1 .0; } 

virtual  Components  drop(int  gni); 
virtual  vector<int>  giF()  const; 
private: 

DDT3  dQdt; 
bool  dropj; 
double  Is,  Vt,  leakG; 
double  vtm,  bv,  ibv; 

//  charge  based  capacitance 
double  tt,  phi,  cjo,  m,  f1 ,  f2,  f3,  fcp; 
static  const  char*  _cnamesn; 
double  Jambda; 

}; 

The  class  Diode  has  two  public  functions  F()  and  JacobianF()  that  are  used  to  describe 
the  nonlinear  transfer  function  and  its  Jacobian.  The  functions  setLeakConductance 
and  lambda  are  used  for  convergence  and  homotopy  respectively.  The  private  section 
lists  all  the  other  model  parameters,  such  as  saturation  current  (Is),  threshold  voltage 
(Vs),  zero-bias  junction  capacitance  (Cjo),  transit  time  (tt). 

Similar  classes  can  be  added  for  different  devices.  Shown  below  is  another  example  of 
the  BIT  device  class. 


12 


class  BJT  :  public  Component  { 
public: 

BJT() :  IS(0)  {} 

BJT(const  StreamElement&  def,  class  StringTolndex&  stoi); 
virtual  void  F(RArray&,  double,  const  RArray&)  const; 
virtual  void  JacobianF(RMatrix&,  double,  const  RArray&)  const; 
virtual  Component*  create(const  StreamElement&  def,  class 
StringTolndex&  stoi)  const; 

virtual  const  char**  cnames()  const  { return  _cnames; } 
virtual  vector<int>  giF()  const; 
private: 

DDT3  dQdt; 

double  IS,  ISC,  ISE,  NC,  NE,  NF,  NR,  BR,  BF; 
double  VAF,  VAR,  IKF,  IKR; 

double  CJE,VJE,MJE,CJC,VJC,MJC,FC,TR,TF,XTF,ITF; 
double  fbe1  ,fbe2,fbe3,fbc1  ,fbc2,fbc3; 
double  fpbc,fpbe; 

double  F1  (double  m  ,  double  f ,  double  v) 

{ 

return((v/(1  -  m))*(1  -  pow((1  -  f),  1  -  m))); 

} 

double  F2(double  m  ,  double  f) 

{ 

return(pow((1  -  f),  (1  +  m))); 

} 

double  F3(double  m  ,  double  f) 

{ 

return(1  -f*(1  +  m)); 

} 

double  Vt,  gmin; 

static  const  char*  _cnamesQ; 


The  DDT3  structure  used  in  the  BJT  and  diode  class  performs  the  third  order  time 
derivative.  Also  included  are  DDT  and  DDT2  for  linear  and  2"*^  order  time  derivatives. 
Having  seen  the  structure  of  the  models  we  now  introduce  the  structure  of  the  matrices 
that  are  solved  by  HAARSPICE. 

2.2.2  Circuit  Representation  Matrices 

Common  matrix  formulation  methods  for  circuit  analysis  are  State  Variable,  Modified 
Nodal  Analysis  (MNA)  and  the  Tableau  Method  [3].  Inside  HAARSPICE  we  have  used 
the  modified  nodal  matrix  type  of  formulation.  The  general  form  of  MNA  for  a  Linear 
time  Invariant  (LTI)  circuit  is  given  as  follows. 
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Gv{t)  +  C—v{t)  =  Mu{t),  v(0)  =  Vo 
dt 

G  =  all  conductces  and  constants 
C  =  all  capacitor  and  inductor  values 
M  =  connection  to  the  input  source  u(t) 
v{t)  =  vector  of  nodal  voltages 

The  advantage  of  MNA  over  conventional  nodal  matrix  formulations  using  the  Kirchoff 
Current  Law  (KCL)  is  that  it  can  include  independent  voltage  and  dependent  sources  in 
the  G  matrix  by  adding  a  vector  for  branch  currents.  This  helps  in  formulating  a  matrix 
that  can  be  used  to  compute  node  voltages  and  branch  currents  concurrently.  The 
Tableau  method  is  similar  but  includes  all  the  branch  currents  in  the  given  circuit.  The 
matrices  generated  using  the  Tableau  method  are  large  and  sparse.  In  HAARSPICE,  as 
in  conventional  SPICES,  we  generate  modified  nodal  matrices  that  are  sparse  and 
computationally  manageable. 

Below  is  a  simple  resistive  circuit  with  its  equivalent  modified  nodal  matrix 
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7.2. 

The  G-matrix  shown  above  has  two  branch  current  entries  for  the  two  voltage  sources. 
The  datum  or  ground  node  is  set  to  zero  and  is  not  included  in  the  G-matrix  formulation. 
The  versatality  of  the  MNA  formulation  is  seen  in  its  ability  of  matrix  resuse  for  different 
kind  of  analyses.  The  modified  nodal  analysis  matrix  formulation  for  a  simple  RLC 
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network  for  time  domain  analysis  is  shown  in  Fig.  4.  The  inductors  and  capacitors  are  the 
storage  elements  and  are  described  in  the  C-matrix  (i.e.  derivative  matrix). 


SystemEquation  G  x  +  Cx'  =  w 
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Inside  HAARSPICE,  the  G  and  C  matrices  are  stored  in  the  Circuit  class.  Below  is  a 
section  of  the  Circuit  class: 

class  Circuit :  public  mtcode  { 
public: 

CircuitO  :  N(0),  _xA(NULL),  _yA(NULL),  _BuA(NULL),  _BFA(NULL), 
_pxT1A(NULL),  _pyT1A(NULL),  _heapA(NULL),  _run(NULL)  {} 

Circuit(int  N_,  string  _name_) 

:  N(N_),  _name(_name_),  _cnG(0),  _cnS(0),  _singularS(false), 

_gmin(0),  _cmin(0),  _reltol(1  e-3),_abstol(1  e-6),Jabstol(1  e-1 2), 

_G(N_,  N_,  0.0),_C(N_,  N_,  0.0),_xA(NULL),_yA(NULL),  _BuA(NULL). 
_BFA(NULL),  _pxT1A(NULL),_pyT1A(NULL),_heapA(NULL),_run(NULL){} 
Circuit(const  Circuits  c)  { *this  =  c; } 
virtual  ~Circuit(): 

Circuits  operator=(const  Circuits); 

operator  int()  const  { return  N; } 
string  name()  const  { return  _name; } 
virtual  RMatrix  G()  const  { return  _G; } 
virtual  RMatrix  C()  const  { return  _C; } 

//  returns  true  if  all  equations  are  at  limit 
virtual  bool  setLeakConductance(double  leakG); 

//  returns  true  if  all  components  are  1 
virtual  bool  lambda(double  lambda); 

//  F  is  non-linear  function 
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virtual  RArray  F(double,  const  RArray&)  const{return  RArray(): } 

//  returns  dx/dt  =  )(F(x)  -  G  x) 

virtual  RArray  operator()(doubIe  t,  const  RArray&  xA)  const; 

//  for  delay  effects,  time-index,  time-array,  x-matrix,  zero-time,  xero-time  x 
//  thA  and  txM  have  values  from  times  less  than  tA[0]. 
virtual  RArray  F(int  tn, const  RArray&  tA,const  RMatrix&  xM,  double  tO,  const 
RArray&  xOA, 

const  RArray&  thA,  const  RMatrix&  xhM)  const; 
virtual  RArray  u(double)  const  { return  RArray(); } 

//  analytic  jacobian 

virtual  RMatrix  JacobianF(double,  const  RArray&)const{return  RMatrix(); } 

//  numerical  jacobian 

RMatrix  JacobianF(double  t,  const  RArray&  x2A,  const  RArray&  x1A)  const; 


The  G  and  C  matrices  are  of  the  type  RMatrix.  The  node  voltages  and  branch  currents 
computed  by  HAARSPICE  are  stored  in  RArray.  Both  these  structures  are  as  follows: 

#ifndef  RMATRIX_H  #ifndef  RARRAY_H 

#define  RMATRIX_H  #define  RARRAY_H 

#include  <matrix.h>  #include  <array.h> 

typedef  Matrix<double>  RMatrix;  typedef  Array<double>  RArray; 

double  norm(const  RMatrix&  M);  #endif 

double  infinity_norm(const  RMatrix&  M); 

#endif 

With  the  circuit  described  by  the  G  and  C  matrices,  one  can  now  solve  for  the  circuit 
response  using  the  different  algorithms.  For  solving  linear  circuits  we  recommend  using 
the  eigenspace  linear  domain  solver  (eld)  inside  HAARSPICE,  which  gives  the  best 
results  in  terms  of  speed  and  accuracy.  The  details  of  this  solver  are  described  in  the 
earlier  section. 

2.2.3  Simulation  of  Nonlinear  Circuits 

In  conventional  circuit  simulators,  time  domain  analysis  of  nonlinear  circuits  is  done  by 
linearization  of  the  nonlinear  components  during  each  time  step.  In  HAARSPICE  we  use 
two  types  of  algorithms:  the  first  type  uses  conventional  methods  such  as  Newton- 
Raphson  that  require  computing  the  Jacobian  and  the  second  uses  non-Jacobian  methods. 
Shown  in  Fig.  5  is  the  simple  model  of  the  4*'’  order  delta-sigma  modulator  with  all  linear 
components  and  the  quantizer  being  the  only  non-linearity. 

The  nonlinearity  in  a  circuit  is  stored  in  the  Circuit::?  structure  as  shown  below 
RArray  Circuit::F(int  tn,  const  RArray&  tA,  const  RMatrix&  xM,  double  tO,  const 
RArray&  xOA, 

const  RArray&,  const  RMatrix&)  const  { 
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Fig.  5:  4'*'  order  delta-sigma  modulator  with  quantizer  nonlinearity 

HAARSPICE-eld  makes  the  RMatrix  xM  =  elaS(circuit,  N,  runTime)  call  to  solve  this 
circuit.  The  elaS  routine  is  executed  by  RMatrix  ELASolver;:operator() 

The  HAARSPICE-eld  algorithm  performs  the  time-domain  simulation  with  lOX 
improvement  is  speed  in  comparison  with  SPICE.  The  EFT  of  the  output  from  a  order 
delta-sigma  modulator  using  a  16384  point  Blackman  window  is  shown  in  Fig.  6. 


Fig.  6:  A  4*  order  delta-sigma  modulator  output  spectrum 


2.2.4  HAARSPICE  Commands 


The  HAARSPICE  circuit  simulator  is  a  suite  of  programs  used  for  time  domain  and  DC 
analyses  of  linear  and  nonlinear  analog  circuits  and  systems.  The  HAARSPICE  suite  of 
programs  runs  on  Linux  7.02  and  has  a  text-based  interface.  Typing  “haarSPICE”  on 
the  command  line  prompt  yields  a  display  of  the  following  options: 


enl  -  Eigenspace  non-linear  solver  (jacobian) 

eld  -  Eigenspace  linear  domain  solver 

ese  -  Eigenspace  evolution  nl-domain  solver(jacobian  free) 

dcs  -  DC  sweep 

snic-  Supported  netlist  components 
dcp  -  DC  Point  of  circuit  at  t  =  0 

The  current  status  of  HAARSPICE  supported  components  can  be  checked  by  typing  snlc 
(i.e.,  supported  netlist  components),  on  the  command  line  prompt. 

available  netlist  components: 

tag  (...)  atod|AtoD  parameters  ??? 
tag  (...)  bjt|NPN|PNP  parameters  ??? 

tag  (nodeA  nodeB)  capacitor  c=value  "value  of  capacitance  between  nodeA  and 
nodeB" 

tag  (...)  cccs|cccs_hdl  parameters  ??? 

tag  (...)  ccvs  parameters  ??? 

tag  (...)  delay|delay_op  parameters  ??? 

tag  (...)  quantizer3|differentialQuantizer  parameters  ??? 

tag  (...)  diode|diode_hdl|diode_sch  parameters  ??? 

tag  (...)  dsmod|delta-sigma-modulator  parameters  ??? 

tag  (...)  dtoa|DtoA  parameters  ??? 

tag  (nodeA  nodeB)  inductor  l=value  "value  of  inductance  between  nodeA  and 
nodeB" 

tag  (...)  isource|csource  parameters  ??? 
tag  (...)  quantizer1|quantizer  parameters  ??? 

tag  (nodeA  nodeB)  resistor  r=value  "value  of  resistance  between  nodeA  and 
nodeB" 

tag  (...)  sign  parameters  ??? 

tag  (nodeAl  nodeA2  nodeBI  nodeB2)  transformer  n1=value  n2=value2 

"value1/value2  is  gain  between  nodes  A  and  nodes  B" 

tag  (...)  vccs  parameters  ??? 

tag  (...)  vcvs  parameters  ??? 

tag  (...)  vsource  parameters  ??? 

Note  :  ???  denotes  instance  parameters 

The  command  used  to  run  the  different  HAARSPICE  algorithms  has  a  common  structure. 
For  example,  the  command  used  to  execute  the  HAARSPICE-eld  algorithm  is  given  by 

eld  (filenamej-testcircuit)  log2(timesteps)  run-time  [-savenodes]  [-writecircuit] 

filename: 
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Currently,  “HAARSPICE-eld”  supports  Cadence  Analog  Artist  generated  “spectre  like” 
netlist.  The  components  used  are  described  in  the  earlier  section.  All  spectre  related 
analyses  statements  are  ignored  by  HAARSPICE-eld. 


Variables.  Includes  two  sub-attributes: 

timesteps  order:  <4-20>  number  of  timesteps 
runtime:  time  for  which  the  simulation  is  run 


savenodes.  By  default  all  the  nodes  in  the  netlist  are  saved.  Also  includes  branch 
currents  or  pseudo-nodes  that  describe  the  currents  in  a  Modified  Nodal  Matrix. 

Writecircui.  Writes  out  information  about  circuit 

~/eld/order#,  totaltime. 

For  example,  a  directory  generated  by  running  a  simulation  for  4.098  /rs  with  an  order  of 
15  (2’^)  is  named  ^Veld/2E15,4u” 


The  data  files  generated  are  in  ascii  text  format  and  have  4  columns  of  numbers  in  double 


float  type.  An  example  of  a  data  file 

<time>  <value> 

6.2500000000006-05  4. 1 691 60088445e-1 1 

1 .8750000000006-04  5.0000000052006-01 

3.1 250000000006-04  5.0000000071 466-01 


<fr6quency>  <sp6ctral  dansity  /w  Blackman  window> 
0.0000000000006+00  -5.36522491 08376+01 
2.4414062500006-01  -5.3124662069376+01 

4.8828125000006-01  -5.5007547036736+01 


The  command  for  eigenspace  evolution  nonlinear  domain  solver  (ese)  inside 
HAARSPICE  also  has  similar  command  structure  with  the  addition  of  integration_order 
option. 

ese  (-netlistifilename)  integration_order  log2(timesteps)run-time  [-savenodes]  [- 
writecircuit] 

The  command_names[]  structure  stores  the  commands  used  with  HAARSPICE.  One  can 
increase  this  structure  to  add  new  commands.  The  next  section  in  this  chapter  lists  the 
simulation  and  benchmarking  results. 
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2.3  Simulation  Results  and  Benchmarking 

In  this  section  we  present  some  of  the  results  that  were  demonstrated  at  the  final  DARPA 
Neocad  review  meeting  in  Coronado  Bay  on  July  15-16,  2004. 

Performance  Metric 

HAARSPICE-eld  performance  for  a  order  Band  Pass  Delta-Sigma  Modulator  circuit 
was  tested  on  a  DELL  C800  Laptop  running  Linux  (Red  Hat  7.02)  operating  system. 

The  Table  shown  below  gives  the  test  results: _ 


HAARSPICE  -eld 

SPICE/Spectre 

Platform 

63  seconds 

593  seconds 

Linux  7.02 

In  comparison  with  time-stepping  algorithms  used  in  SPICE,  HAARSPICE-eld  gives 
~10X  improvement  in  speed. 

A  performance  metric  giving  the  measure  of  accuracy  is  the  Dynamic  Range  of  the 
simulator.  To  measure  the  DR  of  the  simulator  we  perform  a  two-tone  test  on  the  4*'’ 
order  Delta-Sigma  Modulator. 


4th  order  DS  modulntor 
non-linearity  :  v  v  a=  vl  +  v2 


frequency,  MHz 

Fig.  7:  Spectrum  of  the  Two-Tone  Test  measured  at  the  output  for  different  number  of  clock  cycles. 

The  Two  Tone  Test  has  one  input  signal  m,  at  1008.5  MHz  with  magnitude  of  500mV 
and  a  second  signal  U2,  at  1007.5  MHz  with  a  magnitude  of  0.5mV.  The  nonlinearity 
introduced  is  through  the  Trans-conductance  Cell  (i.e.  Gm-Cell)  of  the  Modulator.  The 
only  other  nonlinearity  is  that  of  the  quantizer.  The  Gm-Cell  nonlinearity  generated  a 
3'^'^  order  harmonic  is  seen  at  2x1008.5  -  1007.5  =  1009.5MHz  as  seen  in  Fig.  7. 
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Performance  Metric 


The  Spurious  Free  Dynamic  Range  in  the  Two-Tone  test  is  taken  as  the  Dynamic  Range 
of  the  simulator. 

From  Fig.  7,  one  can  compute  the  DR  to  be  Pmax  -  Pmin  =  -20dB  -  (-200)  dB  =  180  dB 
To  resolve  the  3'^‘*  order  Harmonic  and  get  a  DR  of  more  than  170  dB  it  took 
HAARSPICE-eld  1  Million  Clock  Cycles. 

The  Table  shown  below  gives  the  DR  with  two-tone  test: _ 


HAARSPICE  (eld  routine) 

SPICE/Spectre 

Clock  Cycles 

180  dB 

173  dB 

512K 

This  makes  HAARSPICE-eld  particularly  useful  in  circuits  for  communications,  radar 
and  other  systems  that  require  high  Dynamic  Range. 

Performance  Metric 

HAARSPICE-ese  performance  for  a  3-latch  Quantizer  used  in  a  4’'’-order  Band-Pass 
Delta-Sigma  Modulator  circuit  was  tested  on  a  DELL  C800  Laptop  running  Linux 
(RedHat  7.02)  operating  system. 

The  Table  shown  below  gives  the  test  results. 


HAARSPICE  (ese) 

HAARSPICE  (enl) 

Generic  Time-stepping  code 

Platform 

400  seconds 

>  5000  seconds 

Linux  7.02 

HAARSPICE-ese  is  not  optimized.  We  are  working  on  adding  ARPACK  routines  to  do 
mathematical  computations.  We  expect  the  optimized  code  to  give  1 0x  speed  up  in 
comparison  with  uniform  and  adaptive  time  stepping  code.  Detailed  simulation  results 
can  be  found  in  the  HAARSPICE  manual  in  the  appendix  section. 


Fig.  2:  3-Latch  Quantizer  output 
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CHAPTER  3  MODEL  ORDER  REDUCTION  TOOL  (BEMP) 


BEMP  (Broadband  Efficient  Macro-modeling  Program)  was  developed  at  the  Georgia 
Institute  of  Technology  to  implement  the  research  conducted  on  model  order  reduction  as 
part  of  the  NeoCAD  program.  The  tool  is  coded  using  the  C++  language  and  is 
executable  on  the  Windows  operating  system.  This  chapter  discusses  the  methodology 
used  in  BEMP  to  obtain  model  order  reduction,  the  associated  algorithms,  and  some 
benchmarking  results. 

BEMP  integrates  the  electromagnetic  behavior  of  passive  systems  into  conventional 
computer-aided-design  (CAD)  tools  through  the  generation  of  multi-port  macro-models 
of  the  passive  systems.  A  macro-model  is  a  black-box  representation  of  a  system  which 
accurately  reproduces  its  port-to-port  behavior  as  shown  in  Fig.  3.1.  Using  such  macro¬ 
models,  designers  can  account  for  the  electromagnetic  effects  of  passive  systems  during 
the  design  and  simulation  of  multi-GHz  electronic  systems. 


Output 

Port 

Bebatlor 


J 


Fig.  3.1:  Macro-model 


3.1  Methodology  used 

Realistic  distributed  passive  networks  operating  over  a  broad  frequency  range  often 
contain  hundreds  of  resonant  peaks  and  the  amplitude  variation  of  the  frequency  response 
can  be  large.  The  representation  of  such  systems  using  macro-models  requires  high  order 
macro-models  with  hundreds  of  resonant  poles.  This  can  be  a  cumbersome  task  due  to  the 
large  amplitude  variation  and  the  broad  frequency  range  to  be  covered,  and  can  lead  to 
numerical  problems.  Hence,  ensuring  accuracy  along  with  stability  and  passivity  over  a 
broad  frequency  range  becomes  a  challenge.  BEMP  overcomes  this  challenge  through  the 
use  of  rational  functions  for  approximating  the  frequency  response  of  a  passive  network. 
The  representation  thus  obtained  is  compensated  for  enforcing  system  characteristics  like 
stability  and  passivity,  while  maintaining  accuracy.  Finally  the  compensated 
representation  is  converted  into  a  lumped-parameter  net-list  that  can  be  directly 
integrated  into  a  circuit  simulator  like  SPICE. 

3.1.1  Rational  Functions 
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Using  rational  functions,  the  frequency  response  H(s)  of  any  linear  time-invariant 
passive  network  can  be  represented  as: 


NS 

E  ansS"® 

=  - 

E 

ds=0 


(1) 


where  s  =  jo),  oo  is  the  angular  frequency  in  radians  per  second,  and  a„s  and  bds  are 
unknown  real  numerator  and  denominator  coefficients,  respectively  [1],  In  (1),  H(s)  can 
be  scattering,  admittance,  or  impedance  parameters  generated  from  an  electromagnetic 
simulation  or  measurements.  The  goal  of  solving  (1)  is  to  represent  the  frequency 
response  H(s)  using  a  rational  function  by  computing  the  orders  of  the  numerator  and 
denominator,  NS  and  DS,  respectively,  and  the  real  vectors  [d\  and  [6]  consisting  of  the 
numerator  and  denominator  coefficients,  respectively. 

Using  filter  theory,  H{s)  in  (1)  can  also  be  represented  as  a  summation  of  filters  in 
the  form: 

H(s)  =  Y,HTp{s) 

m 

+  E«S-(»)  (2) 

+  X/ 

k 

+  X]  ^Ap{^) 


where  H'^lp,  H"bp,  H'^hp  and  I^ap  are  low  pass,  band  pass,  high  pass,  and  all  pass  filters, 
respectively.  The  filters  shown  in  (2)  represent  a  complete  set  for  representing  any 
transfer  function.  The  rational  function  in  (1)  can  be  rewritten  in  a  pole-residue  form  as: 


F(.s)  =  ^ 

da 


I'da 

s  -  Pds 


(3) 


where  pds  and  r^s  are  the  poles  and  residues  of  the  rational  function,  respectively,  and  ds 
is  the  order  of  the  rational  function.  Combining  (2)  and  (3)  results  in: 
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where  p„tr,  r^,  Pnr,  Pni,  oi„,  Pkr,  ^k,  5,  t],  are  real  and  the  poles  are  simple  or  non¬ 
overlapping  poles.  By  combining  the  complex  conjugate  poles,  (4)  can  be  rewritten  as: 
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where  the  superscripts  LPN,  BPN,  and  HPN  are  the  number  of  low  pass,  band  pass,  and 
high  pass  filters,  respectively.  In  (2),  the  low  pass  filter  H”  lp  (s)  consists  of  a  real  pole 
Pmr  with  residue  Vm,  the  band  pass  filter  If  bp  (s)  consists  of  complex  conjugate  poles  pnr  ± 
jpni  with  residues  o^±  /3„;  the  high  pass  filter  ifnp  (■y)  consists  of  a  real  pole  pkr  with 
residue  and  the  all  pass  filter  IPap  (s)  consists  of  residues  5  and  rj. 


3.1.2  Stability  and  Passivity 

Macro-models  constructed  using  rational  functions  need  to  satisfy  the  stability  and 
passivity  conditions  for  a  linear  time  invariant  passive  system.  The  stability  condition 
requires  that  for  a  stable  system,  the  output  response  be  bounded  for  a  bounded  input 
excitation.  Hence,  the  rational  function  representing  a  stable  system  has  to  satisfy  the 
following  stability  constraints: 

1)  The  poles  lie  on  the  left  half  of  the  s-plane,  implying  that  p^r  ^0,  p„r  <  0,  and 
Pkr  ^0  in  the  pole  residue  form  of  the  rational  function. 

2)  The  rational  function  does  not  contain  multiple  poles  along  the  imaginary  axis 
of  the  s-plane. 

3)  The  difference  between  the  numerator  and  denominator  orders  of  the  rational 
function  does  not  exceed  unity,  implying  that  -  D5|  <1 . 

Among  the  above  constraints,  the  second  and  third  constraints  are  enforced  by 
construction  while  the  first  constraint  is  satisfied  in  BEMP  using  a  search  procedure. 

The  passivity  condition  requires  that  a  passive  circuit  does  not  create  energy.  Since  non¬ 
passive  macro-models  combined  with  a  stable  circuit  can  generate  an  unstable  time- 
domain  response,  this  condition  becomes  important  when  the  macro-models  need  to  be 
combined  with  a  larger  circuit  for  time-domain  simulation.  Unlike  the  stability  condition, 
it  is  more  difficult  to  satisfy  the  passivity  condition  during  the  construction  of  macro¬ 
models.  The  passivity  conditions  for  a  multi-port  network  [G(5  =  a  +  yw)]  are  twofold, 
namely,  1)  [G(y*)]  =  [G*(5)]  for  all  s,  where  *  is  the  complex  conjugate  operator  and  2) 
[G(5)]  is  a  positive  real  matrix,  i.e.,  the  product  z*^  [G^  (5*)  +  G(5)]z,  for  all  5  with  Re{s) 
>  0  and  any  arbitrary  vector  z.  When  the  rational  function  matrix  [H{s)]  is  used,  these 
conditions  are  translated  into  the  following  passivity  constraints  [3]: 

1)  [//(5)]  does  not  contain  poles  on  the  right  half  of  the  s-plane. 

2)  [//(i')]  does  not  have  multiple  poles  on  the  imaginary  axis  of  the  s-plane. 

3)  The  coefficients  of  [/f(5)]  are  all  real. 

4)  The  real  part  of  must  be  positive  semi-definite  for  all  frequencies, 

implying  that  the  eigenvalues  of  i?e{[Er(5:)]}  are  positive  or  zero  for  all 
frequencies. 
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The  last  constraint  is  derived  from  the  maximum  modulus  theorem.  Among  the  above 
constraints,  the  first  and  second  constraints  are  included  as  part  of  the  stability  condition 
and  the  third  constraint  is  satisfied  by  construction.  The  fourth  constraint  has  been 
enforced  analytically  in  BEMP  the  details  of  which  are  described  below. 

Based  on  the  maximum  modulus  theorem,  the  passivity  condition  for  a  one-port  network 
can  be  written  as: 

Re{H{s  =  ju})}  >0  ViJ 

It  is  important  to  note  that  s  =jo}  in  (6)  (and  not  s  =  o  +ju)  simplifies  the  derivation  of 
the  analytical  formulae  for  satisfying  passivity  of  the  macro-models.  The  basic  idea 
behind  the  construction  of  passive  macro-models  for  a  passive  system  is  that  the 
summation  of  passive  sub-networks  is  passive.  The  rational  function  H(s)  in  (5)  can  be 
regarded  as  a  summation  of  passive  sub-networks  consisting  of  complex  conjugate  poles 
and  real  poles  with  corresponding  residues,  8  and  r).  If  every  sub-network  in  (5)  satisfies 
the  passivity  condition,  the  rational  function  H(s)  satisfies  the  passivity  condition  as  well. 
Substituting  s  - yco  into  (5),  the  rational  function  H(s  =  ju)  can  be  separated  into  the  real 
and  imaginary  parts  as: 

His  =  joj)  =  Hniju;)  -I-  jHj (jiv)  0) 

By  regrouping  terms,  the  real  Hr{s  =  ju)  and  imaginary  Hj  {s  -yco)  parts  of  the  rational 
function  are  shown  in  (8)  and  (9),  respectively. 
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_4^  y  ^ 
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The  passivity  of  each  sub-network  in  (5)  can  be  satisfied  using  the  following  formulae. 
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(10) 


7m  >  0 
^nPfir  i  f^nPni  ^  0 

i’k  >  0 
<5  >0 


In  (10),  the  analytically  derived  passivity  formulae  can  be  obtained  from  (8)  by  satisfying 
(6).  Equation  (5)  can  be  generalized  for  a  distributed  multi-port  network  containing 
common  poles  as: 


LPN 


m=l 

BPN 


[7m] 

S  Pmr 


Pnr)  ^[/^n]pni 
(S-Pnrf  +7}ii 


(11) 


HPN 


+  E 


S  -Pkr 


+  [<^|  +  [t/I-s 


where  the  residue  matrices  [o^],  [/3„],  [7„,],  [\}/k],  [6],  [i?]  are  a  PxP  matrix  for  a  P-port 
network.  In  (11),  [/f(5)]  is  a  rational  function  matrix,  which  has  to  be  positive  semi- 
defmite  at  all  frequencies.  Using  the  property  of  positive  semi-definiteness,  the  passivity 
formulae  in  (10)  for  a  multi-port  network  can  be  rewritten  as: 

eigenvalues  of  [7^]  >  0 

eigenvalues  of  [-a„p„r  ±  PnPni]  >0  (12) 

eigenvalues  of  [V-a-]  >  0 
eigenvalues  of  [<5]  >  0 

The  following  properties  of  multi -port  passivity  formulae  in  (12)  are  apparent  during  the 
construction  of  the  passive  macromodel.  These  properties  have  been  used  for  the 
construction  of  broadband  passive  macro-models  in  later  sections. 

1)  The  multi-port  passivity  formulae  only  depend  on  the  poles  and  residue 
matrices,  which  are  independent  of  frequency.  Hence,  the  passivity  of  the 
circuit  is  satisfied  over  infinite  frequency. 

2)  The  multi-port  passivity  formulae  are  only  enforced  on  each  sub-network  of 
[i/(5)],  and  there  is  no  relationship  for  passivity  between  sub-networks  except 
that  they  contribute  to  the  overall  response  of  the  macromodel.  This  makes  the 
method  simple  to  use. 

3)  For  compensating  negative  eigenvalues  in  (12),  there  are  two  free  matrix 
variables  [o;,]  and  [|8„]  related  to  two  free  variables  of  complex  conjugate 
poles  Pnr  ±  jpni,  a  free  matrix  variable  [7„,]  related  to  a  real  pole  pmr,  a  free 
matrix  variable  [i/'*]  related  to  a  real  pole  pkr,  and  a  free  matrix  variable  [6]. 
These  can  be  suitably  changed. 

4)  There  are  no  constraints  enforced  on  the  residue  matrix  [rj]. 

The  non-passive  macromodel  that  does  not  satisfy  the  passivity  condition  needs  to 
be  appropriately  modified  so  that  the  macromodel  becomes  passive.  For  enforcing 
passivity  on  generated  macro-model  in  BEMP,  fixed  common  poles  and  symmetric 
residue  matrices  are  assumed  during  compensation.  If  negative  eigenvalues  are  obtained 
in  the  residue  matrices  [7^],  [lA/t],  [5]  in  (12),  negative  eigenvalues  are  set  equal  to  zero  or 
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changed  to  a  small  positive  value  and  then  a  new  residue  matrix  is  reconstructed.  If  the 
passivity  formulae  for  complex  conjugate  poles  with  two  residue  matrix  variables  [a]  and 
[|3]  are  violated,  the  negative  eigenvalues  of  [a]  are  set  equal  to  zero  or  changed  to  a 
small  positive  value  and  then  a  new  residue  matrix  [a]  is  reconstrueted.  Based  on  the 
compensated  matrix  [a],  the  residue  matrix  [/S]  is  iteratively  found  for  satisfying  the 
passivity  formulae  in  (12).  A  small  positive  value  is  used  to  ensure  that  the  maeromodel 
does  not  violate  the  passivity  condition  even  though  this  may  eause  small  numerical 
errors  in  the  solution. 


3.1 .3  Least  Squares  Approximation 


As  mentioned  earlier,  (1)  needs  to  be  solved  to  compute  the  orders,  NS  and  DS, 
and  the  real  eoefficient  vectors,  [a]  and  [6].  Equation  (1)  can  be  rewritten  in  the  form: 

NS  DS 


Y,  -  H{S)  Y  =  0  (13) 

ns=0  ds=0 


For  a  given  H(s)  from  either  measured  or  simulated  data,  which  represents  the  frequency 
response  of  a  one-port  network,  (13)  ean  be  written  as  a  matrix  equation  in  the  form  [3] 


[^] 


a 

b 


=  [0]^[^][.t]  =  [0] 


(14) 


where  the  matrix  [A]  is  given  by 
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DS 

-re[H{sk)  E  4“) 

ds=0 

DS 

-iiv{His,)  E  } 
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(15) 


The  vectors  [a]  and  [6]  in  (14)  are  real  coefficient  vectors  of  the  numerator  and 
denominator,  respectively.  After  pre-multiplying  (14)  with  the  transpose  of  [A],  (14) 


becomes 


[AfMlIi]  -  [0]  (15) 

which  can  be  written  as  an  eigenvalue  equation 

[yl]^[zl][x]  =  (17) 

where  is  the  minimum  eigenvalue  of  the  matrix  [AY  [A],  where  T  is  the  transpose 
operator.  From  (17),  the  eomputation  of  the  real  coefficient  veetor  [x]  requires  the 
estimation  of  the  integer  orders,  NS  and  DS.  The  orders  in  (1)  can  be  estimated  using  the 
minimum  eigenvalue  traeking  method  shown  in  Fig.  3.2,  which  scans  the  orders  NS  and 
DS  over  a  pre-determined  range.  In  Fig.  3.2,  the  minimum  eigenvalue  has  been  plotted  as 
a  fimetion  of  the  orders,  NS  and  DS.  Based  on  (17),  a  non-trivial  solution  exists  when 
==0,  as  shown  in  Fig.  3.2.  Hence,  >y,„„  has  been  used  as  a  parameter  for  tracking  the 
optimum  solution.  In  Fig.  3.2,  the  solution  area  represents  the  region  with  a  valid 
solution,  which  corresponds  to  >v,„«  =  10'’'^. 
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Fig.  3.2  :  Minimum  eigenvalue  versus  the  order  NS  =  DS-1 


From  Fig.  3.2,  for  the  specific  example,  the  orders,  NS  =16  and  DS  =17,  result  in  the 
eigenvector  [x]  corresponding  to  the  minimum  eigenvalue.  The  eigenvector  [x]  contains 
the  coefficient  vectors  [a]  and  [b]  of  the  rational  function.  The  stability  can  now  be 
enforced  on  the  denominator  coefficient  vector  [^i]  of  the  eigenvector  [x]  in  (17)  by 
appl3dng  the  constraints  p^r  ^0,  Pnr  ^  0,  and  pkr  ^0.  BEMP  uses  a  root  finding  method 
for  computing  the  poles  and  discards  any  unstable  poles  obtained  in  the  process.  For 
computing  the  residues  corresponding  to  the  stable  poles,  (18)  is  solved  using  the 
eigenvalue  method  discussed  earlier.  The  matrix  structure  in  (18)  depends  on  the 
application  and  can  be  arbitrary.  For  a  multi-port  network  containing  common  poles,  the 
residues  for  each  port  are  constructed  independently  by  solving  (18).  Finally,  the 
passivity  formulae  in  (12)  are  used  to  enforce  passivity  of  the  rational  function. 
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3.1.4  Broadband  Macro-modeling 


Realistic  distributed  networks  such  as  interconnects  operating  over  a  broad 
frequency  range  often  contain  a  large  number  of  poles  and  the  amplitude  variation  of  the 
frequency  response  could  be  large.  This  can  create  numerical  problems  in  (18)  since  the 
matrix  [A]^  [A]  can  become  an  ill-conditioned  matrix.  This  is  apparent  in  (1)  where  the 
power  series  expansion  can  have  a  large  d3aiamic  range.  For  instance,  if  the  frequency 
response  ranging  from  DC  to  2  GHz  needs  to  be  approximated  using  20  complex 
conjugate  poles  and  2  real  poles,  the  dynamic  range  of  elements  in  the  matrix  [A]  is  from 
1  to  (27r  X  2  X  10^)^^,  which  causes  the  matrix  to  become  ill-conditioned.  This  problem 
can  be  improved  using  frequency  scaling  and  the  computed  poles  and  residues  can  be 
reconstructed  using  the  scaling  factor  Uo-  Equation  (13)  can  be  written  in  the  form 


However,  it  has  been  shown  that  the  scaling  factor  in  (19)  for  approximating  the 
frequency  response  does  not  result  in  significant  improvement  in  the  approximation 
beyond  20-30  poles.  BEMP  alleviates  this  ill-conditioned  matrix  problem  by  dividing  the 
system  frequency  response  into  various  frequency  subbands  and  then  processing  each 
subband  individually.  As  shown  in  (11),  the  rational  function  matrix  [i/(5)]  is  represented 
in  the  pole-residue  form  as  a  summation  of  sub-networks.  Both  the  stability  and  passivity 
conditions  in  each  sub-network  are  satisfied  using  stability  constraints  and  multi-port 
passivity  formulae  described  earlier.  Since  stability  constraints  and  passivity  formulae  are 
only  enforced  on  each  sub-network,  there  is  no  relationship  for  the  stability  and  passivity 
conditions  between  sub-networks  except  that  they  contribute  to  the  overall  response  of 
the  macro-models.  This  enables  the  entire  frequency  response  to  be  divided  into  sub¬ 
frequency  bands  (or  subbands).  The  basic  idea  for  the  construction  of  broadband  macro¬ 
models  is  that  if  complex  conjugate  poles  and  real  poles  can  be  extracted  from  a  localized 
region  of  the  frequency  response,  then  the  original  frequency  response  can  be  divided 
into  subbands,  as  shown  in  Fig.s  3.3(a)  and  (b). 


(c)  (d) 

Fig.  3.3:  (a),  (b)  Band  division,  (c)  Subband  reordering,  and  (d) 
Subband  dilation 
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In  Fig.s  3.3(a)  and  (b),  poles  from  the  localized  region  within  a  subband  can  be  extracted. 
The  process  of  dividing  the  entire  frequency  band  of  interest  into  smaller  subbands  is 
called  as  band  division.  The  subbands  are  either  of  uniform  or  non-uniform  width  and 
contain  a  set  of  sampled  frequency  data,  as  shown  in  the  Fig..  The  criterion  for  choosing 
the  width  of  each  subband  depends  on  the  nature  of  the  frequency  response,  but  it  is 
desirable  to  divide  the  entire  frequency  band  into  subbands  where  resonant  peaks  exist.  In 
areas  where  no  resonant  peaks  exist,  the  subbands  are  overlapped  to  minimize  the  number 
of  subbands  required.  The  orders  NS  and  DS  for  each  subband  can  be  estimated  using  the 
minimum  eigenvalue  tracking  method  or  the  number  of  resonant  peaks.  Using  the  band 
division  method,  the  macro-models  for  each  subband  can  be  extracted  in  parallel. 
However,  macro-models  from  each  subband  can  interact  with  each  other  since  the 
extrapolation  of  each  macromodel  outside  the  subband  of  interest  can  produce  a  non-zero 
frequency  response.  This  interaction  can  result  in  an  erroneous  frequency  response  when 
the  macro-models  are  combined.  During  the  least  squares  approximation  process 
discussed  earlier,  poles  located  outside  the  subband  can  be  frequently  found  since  there 
are  no  constraints  that  limit  the  position  of  the  poles.  After  removing  the  unstable  poles, 
stable  poles  located  outside  the  subband  have  to  be  handled  since  they  amplify  the 
interactions  between  subbands.  Although  these  poles  increase  the  accuracy  in  the 
subband  of  interest,  they  usually  reduce  the  accuracy  in  adjacent  frequency  bands. 
Therefore,  selectors  are  used  to  restrict  the  poles  within  the  frequency  band  of  interest 
during  the  generation  of  macro-models  in  each  subband.  This  ensures  smooth 
extrapolation  and  the  absence  of  resonant  peaks  in  adjacent  bands.  BEMP  implements 
four  types  of  selectors,  namely,  low  pass,  band  pass,  high  pass  and  band  reject  selectors. 
To  compensate  for  the  inaccuracy  in  the  subbands  after  the  removal  of  poles  outside  the 
subbands,  three  methods  namely,  subband  reordering,  subband  dilation  and  pole 
replacement  methods  have  been  implemented  in  BEMP. 

In  subband  reordering  method,  the  macro-models  for  each  subband  are 
constructed  after  subtracting  the  frequency  response  of  the  previous  macro-models  from 
the  original  frequency  response.  As  discussed  earlier,  selectors  are  applied  to  handle  the 
poles  located  inside  or  outside  the  frequency  band.  At  every  stage  of  this  process,  the 
subbands  are  reordered  based  on  the  magnitude  of  the  frequency  response  in  decreasing 
order.  The  macromodel  for  the  subband  with  the  largest  magnitude  is  always  constructed 
first,  prior  to  the  other  subbands.  This  is  shown  in  Fig.  3.3(c),  where  subbands  2  and  3  in 
Fig.  3.3(a)  are  interchanged  during  the  macromodel  construction  process.  Subband 
reordering  reduces  numerical  errors  caused  by  dominant  poles  in  an  adjacent  subband. 

In  subband  dilation  method,  the  subbands  are  dilated  to  provide  local  correction  in 
the  region  between  adjacent  subbands.  This  method  results  in  an  overlap  between 
subbands,  as  shown  in  Fig.  3.3(d).  In  the  Fig.,  subband  ’n’  is  dilated  to  overlap  subbands 
’n-1’  and  ’n+1’.  The  amount  of  overlap  is  determined  by  the  position  of  poles  in  each 
subband.  If  the  poles  are  located  at  the  boundary  between  bands,  then  these  poles  have 
the  maximum  effect  on  both  frequency  bands.  As  the  poles  are  located  farther  away  from 
the  boundary,  the  effect  of  these  poles  on  adjacent  bands  is  minimized.  Using  this 
criterion,  the  subbands  are  suitably  dilated  such  that  the  poles  at  the  boundary  lie  in  the 
overlap  region.  The  macro-models  of  three  subbands  are  then  iteratively  corrected  by 
monitoring  the  error  in  three  subbands  using  the  pole  replacement  method  discussed  later. 
Though  subband  reordering  and  subband  dilation  methods  minimize  the  interaction 
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between  subbands,  correction  may  once  again  be  necessary  since  these  methods  may 
sometimes  miss  the  poles  at  the  boundary  between  subbands  or  generate  spurious  poles 
within  the  subband. 

BEMP  uses  a  pole  replacement  method  to  improve  the  accuracy  of  the 
constructed  macro-models  by  discarding  spurious  poles  and  extracting  accurate  poles,  as 
shown  in  Fig.  3.4. 


dr  Compute  error  and  location 

£mo  1 


Lmo  J  \[Hm(s)]\ 


RMSn  =  { [ife(^)]-  [/TkC^)]  } 


Determine  the  subbaud  bandnidth 
and  collect  poles  and  residues 

Subband 

( 

} 

s-p„^  ^-p„.vj 

Calculate  the  error  RMSe  in  the  subband 

RMSe  =  { [/fo(s)]-  [iyw(5)] } 

[jy5(5)] = I  ^ .  1  1 

U-Pml  S-Pm2  S-p„y\ 

Calculate  the  difference  [He(s)]  in  the  subband 

k  [Hm(s)]=[Hmis)]-[ms)] 

Fig.  3.4:  Pole  Replacement 


The  pole  replacement  method  is  applied  after  the  macro-models  for  each  subband  are 
constructed.  The  details  of  the  pole  replacement  method  are  described  below: 

1)  The  process  begins  with  the  comparison  between  the  frequency  response  [Hm{s)'\ 
from  the  macro-models  and  the  original  frequency  response  [iTo(5)]. 

2)  Based  on  the  error  criterion,  the  maximum  difference  value  Emo  and  location 
Lmo  between  [//o(5)]  and  [//m(s)]  are  calculated  and  stored. 

3)  The  algorithm  searches  for  a  set  of  poles  around  the  location  Lmo  and  determines 
the  width  of  the  subband  to  be  recalculated. 

4)  The  root-mean-square  error  RMSe  between  [i/o(5)]  and  [Hm{sy\  in  the  subband  is 
calculated  and  stored.  The  frequency  response  associated  with  the  poles  in  the 
subband  is  then  subtracted  from  [Hmis)]  in  order  to  recalculate  the  frequency 
response  of  the  subband.  Then,  the  difference  between  [i/oCi')]  and  [//w(5)]  is 
calculated  and  stored  in  [i/e(5)]. 

5)  Using  [//e(s')],  new  poles  and  residues  are  recalculated  in  the  subband. 

6)  The  frequency  response  using  new  poles  and  residues  [Hn{s)]  is  computed  and  the 
root-mean-square  error  RMSn  between  [He{s)\  and  [//«(5)]  in  the  subband  is  again 
calculated  and  stored. 
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7)  IfRMSn  <  RMSe,  then  old  poles  and  residues  are  replaced  with  new  poles  and 

residues.  IfRMSn  >  RMSe,  then  old  poles  and  residues  are  retained. 

8)  Steps  1  to  7  are  repeated  until  the  error  is  minimized. 

The  location  and  size  of  each  subband  is  recalculated  iteratively  in  the  pole 
replacement  method.  The  error  criterion  determines  the  maximum  error  value  Emo,  the 
location  Lmo,  and  the  width  of  the  subband  being  recalculated.  BEMP  uses  the  root- 
mean-square  error  as  the  error  criterion  while  the  subband  width  around  the  location  Lmo 
is  determined  based  on  the  position  of  poles  being  replaced.  Assuming  the  number  of 
poles  being  replaced  is  k,  the  algorithm  searches  for  (k  +2)  poles  around  the  location 
Lmo.  The  {k  +2)  poles  are  then  reordered  in  increasing  frequency  such  that  pole  p{\) 
corresponds  to  the  lowest  frequency  and  pole  p{n)  corresponds  to  the  highest  frequency. 
The  left  and  right  boundaries  of  the  subband  are  then  determined  as  the  mid-frequency 
points  between  poles  p{\),p{l)  and  p{n-\),p{n),  respectively.  If  the  width  and  position  of 
the  subband  remains  the  same  as  before,  then  the  subband  is  suitably  dilated  to  minimize 
error.  If  the  location  Lmo  is  not  changed  after  dilating  the  subband,  then  the  location  Lmo 
is  stored  and  ignored  in  subsequent  iterations.  With  the  use  of  band  division,  subband 
reordering  and  subband  dilation  methods  initially,  the  required  number  of  iterations  for 
the  pole  replacement  method  locally  is  minimum.  Even  though  the  pole  replacement 
method  was  originally  intended  for  increasing  the  accuracy  of  poles  and  residues  of 
passive  macro-models,  it  can  be  used  to  compensate  for  negative  eigenvalues  by  inserting 
additional  poles  and  residues,  as  discussed  earlier.  After  the  poles  and  residues  in  the 
entire  frequency  band  or  subbands  are  calculated,  the  residue  matrices  [6]  and  [rj]  can  be 
calculated.  It  is  important  to  note  that  broadband  macro-models  can  be  constructed  using 
band  division,  subband  reordering,  subband  dilation,  and  pole  replacement  methods  along 
with  frequency  scaling,  without  having  an  ill-conditioned  matrix  problem.  In  addition, 
since  the  number  of  required  poles  is  reduced  and  the  orders  NS  and  DS  become  small 
within  a  subband,  the  size  of  the  matrix  {Af  [A\  becomes  small  and  the  required 
computational  memory  and  CPU  time  can  be  reduced. 

3.1.5  Network  Synthesis 

Using  the  pole-residue  representation  of  the  rational  function  representing  the 
admittance  parameters,  electrical  networks  consisting  of  resistors,  inductors,  capacitors, 
and  controlled  sources  can  be  constructed.  In  (5)  and  (II),  RLC  networks  can  be  used  to 
represent  complex  conjugate  poles  and  residues  of  the  band  pass  filter,  RL  networks  can 
be  used  to  represent  real  poles  and  residues  of  the  low  pass  filter,  and  RC  networks  can 
be  used  to  represent  real  poles  and  residues  of  the  high  pass  filter.  A  resistor  and  a 
capacitor  can  be  used  to  represent  the  residues  of  the  all  pass  filter.  The  electrical  network 
configurations  are  shown  in  Fig.  3.5  and  the  values  of  the  components  are  shown  in  Fig. 
3.6.  It  is  important  to  note  that  the  frequency  scaling  in  (19)  and  the  loeal  ground  for 
SPICE  sub-circuits  have  been  used  in  the  circuit  implementation. 
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Fig.  3.5:  Electrical  network  configurations  for  macromodels 
using  admittance  parameters 


Fig.  3.6:  The  values  of  elements  (electrical  representations  of 
macro-models) 


3.2  Broadband  Efficient  Macro-modeling  Program  (BEMP) 

3.2.1  BEMP  version  3.0 

The  Broadband  Efficient  Macro-modeling  Program  (BEMP)  incorporates  the  entire 
modeling  process  described  in  the  previous  section.  BEMP  has  been  coded  using  the  C++ 
programming  language  and  is  executable  on  the  Windows  platform.  The  program  inputs 
S,  Y,  or  Z  parameter  representations  of  multi-port  networks  and  generates  SPICE  sub- 
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circuit  files  that  can  be  directly  incorporated  into  a  SPICE  simulation.  The  Touchstone 
format  is  used  to  input  data  into  BEMP.  This  format  has  been  chosen  so  that  both 
measurement  and  simulation  data  can  be  easily  and  efficiently  modeled  using  BEMP. 
Most  field  solvers  and  all  vector  network  analyzers  have  the  option  of  storing  data  in  the 
Touchstone  format.  Details  on  the  format  are  available  in  the  BEMP  manual. 

The  input  data  in  Touchstone  format  when  read  by  BEMP  is  first  converted  to  a  Y- 
parameter  representation  through  simple  matrix  operations.  All  macro-modeling 
techniques  described  in  the  previous  section  are  then  performed  on  the  Y-parameters.  Fig. 
3.7  illustrates  how  BEMP  goes  about  generating  a  SPICE  net  list  form  an  input 
Touchstone  file  containing  multi-port  frequency  response  data  for  a  passive  network.  The 
Fig.  begins  with  explaining  the  algorithm  after  the  input  data  has  been  converted  to  the 
Y-parameter  format. 


INITLiUZiTION : 

01  load  original  admittance  parameters  Ho  and  the  frequency/o 
02  load  suWtands  FSS  (Band  Division) 

03  load  estimated  orders  NS  and  DS 
04  initialize  poles  and  residues 
LOCAL  LOOP  : 

05  for  n=l  to  the  mimber  of  subbands 

06  set  sub-computational  domain  SCD  =  FS9(n)  (Subband  Recffdering) 

07  apply  the  band-rqection  selector  on  poles  and  resides 

08  calculate  the  frequenq'  response  Hm  using  poles  and  residues  over  the  SCD 

09  calculate  He=Ho-  Hm  and  save  Hg = He  and  maxHg  over  the  SCD 

10  initialize  good-sub-poles  and  good-sub-residues 

11  for  num,  den  =  a  range  of  estimated  orders  AtS  and  2W  ([S]) 

12  initialize  sub-poles  and  sub-residues 

13  do  frequency  scaling  on  fo  (Frequency  Scaling) 

14  calculate  the  denominator  coefficients  using  the  eigenvalue  mefliod  ([8]) 

15  construct  stable  sub-poles  from  the  denominator  coefficients 

16  undo  frequency  scaling  on  stable  sub-poles 

17  calculate  sub-residue  matrices  of  sub-poles 

18  calcuLite  the  residues  [ ^  and  [t]J 

19  apply  multiport  passivity  fbnimlae 

20  if  toere  is  a  violation,  compensate  negative  eigenvahies.  end 

21  calculate  the  frequency  response  Hm  using  sub-poles  and  sub-residues 

22  calculate  Hs=  He-  Hm  over  the  SCD 

23  if  max£rg>maxfls  based  on  error  criterion 

24  save  msxHg  =  maxHs 

25  save  good-sub-poles  =  sub-poles,  good-sub-residues  =  sub-residues 

26  end 

27  appfr  appropriate  selectors  on  good-sub-poles  and  good-sub-residues 

28  add  good-siib-poles  and  good-sub-residues  to  poles  and  residues 

29  end 

30  apply  subband  dilation  and  pole  replacement  methods  over  the  SCD 

31  end 
GLOBAL  LOOP  : 

32  for  n=l  to  the  monber  of  global  optimization 

33  applj'  the  pole  replacement  method  over  the  entire  frequency  band 

34  update  the  residues  [Sj  and  [t]] 

35  end 

36  generate  SPICE  net  lists 


Fig.  3.7  A  flow  chart  of  BEMP 
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The  application  of  the  modeling  process  in  BEMP  as  described  previously  is  graphically 
illustrated  in  Fig.  3.8.  The  exact  poles  of  the  network  are  shown  at  the  top  of  the  Fig..  It 
has  been  assumed  that  all  the  poles  are  complex  conjugate  poles.  In  Fig.  3.8,  the  good 
poles  are  the  exact  poles  of  the  network.  All  poles  except  the  good  poles  are  regarded  as 
spurious  poles.  Using  the  band  division  method,  the  entire  frequency  response  has  been 
divided  into  7  non-uniform  subbands  such  that  each  subband  has  between  one  to  four 
resonant  peaks.  The  bands  are  numbered  from  1-7  horizontally  and  named  A  -  G 
vertically.  This  has  been  done  intentionally  to  differentiate  band  division  from  subband 
reordering.  The  steps  illustrated  below  assume  that  after  the  extraction  of  the  poles  from 
each  subband,  the  corresponding  residues  are  extracted.  The  frequency  response  of  the 
subband  macromodel  is  then  subtracted  from  the  overall  response  prior  to  the 
macromodel  construction  for  the  next  subband.  The  various  steps  are  described  below; 


Exact 

poUs 


Band  Division 


Fig.  3.8  Illustration  of  band  division  and  sub-band 
reordering 

1)  During  the  construction  of  sub-macro-models  from  subband  A,  three  good  poles 
and  a  spurious  pole  located  outside  subband  A  are  extracted  using  the  eigenvalue 
method  discussed  earlier.  After  applying  the  low  pass  selector  with  bandwidth 
equal  to  subband  1 ,  three  good  poles  are  extracted. 

2)  After  subtracting  the  frequency  response  of  the  above  macromodel  from  the 
overall  response,  a  spurious  pole  located  outside  subband  B  and  two  good  poles 
are  extracted  from  subband  B.  After  applying  the  band  pass  selector  on  subband 
B,  the  spurious  pole  is  removed  and  two  good  poles  are  retrieved.  Note  that  there 
is  an  exact  pole  near  the  boundary  between  subbands  B  and  C,  which  is  not 
included. 

3)  During  the  calculation  in  subband  C,  a  spurious  pole  located  outside  subband  C 
and  two  good  poles  are  extracted.  After  applying  the  band  pass  selector,  two  good 
poles  are  retained.  Up  to  this  point,  seven  good  poles  have  been  extracted  and  an 
exact  pole  near  the  boundary  of  subbands  B  and  C  has  been  missed. 
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4)  The  next  computational  domain  moves  to  subband  D  corresponding  to  subband  5 
instead  of  subband  4  (subband  reordering).  This  is  because  the  magnitude  of  the 
frequency  response  is  larger  in  subband  5  than  in  subband  4.  Sub-macro-models 
constructed  from  subband  D  result  in  two  spurious  poles  and  two  good  poles. 
After  applying  the  band  pass  selector  to  remove  poles  located  outside  subband  D, 
a  spurious  pole  and  two  good  poles  are  extracted. 

5)  From  subband  E  corresponding  to  subband  4,  a  spurious  pole  and  two  good  poles 
are  extracted.  After  applying  the  band  pass  selector,  a  spurious  pole  and  two  good 
poles  are  retained. 

6)  From  subband  F  corresponding  to  subband  6,  two  spurious  poles  located  outside 
subband  F  and  two  good  poles  are  extracted.  After  applying  the  band  pass 
selector,  two  good  poles  are  retained. 

7)  From  subband  G,  a  spurious  pole  and  a  good  pole  are  extracted  after  applying  the 
high  pass  selector.  After  collecting  sub-macro-models  from  each  subband,  14 
good  poles  and  3  spurious  poles  are  found  using  the  band  division  and  subband 
reordering  methods.  The  pole  replacement  method  is  now  applied  to  correct  the 
good  poles  and  eliminate  the  spurious  poles. 

3.2.2  Additional  Features:  Enforcement  of  causality 

The  macro-models  obtained  using  BEMP  are  guaranteed  to  be  causal  and  stable 
over  all  frequencies.  However,  for  timing  analysis  in  high-speed  digital  design,  especially 
involving  long  delay  interconnects;  causality  of  the  macro-models  is  a  major  issue. 
Causality  deals  with  the  precise  timing  of  signal  propagation  through  passive  structures 
like  interconnects  and  power  distribution  networks.  If  unaccounted  for  it  can  lead  to 
considerable  error  in  the  signal  integrity  analysis  of  networks.  Hence  the  macro-models 
obtained  using  BEMP  need  to  be  made  causal.  This  work  on  enforcing  causality  is  still  in 
progress  and  is  documented  in  the  following  sections.  The  method  described  in  this 
section  will  be  implemented  in  BEMP  v4.0. 

3.2.2. 1  Delay  extraction  from  frequency  domain  data 

With  increasing  clock  frequencies  in  digital  and  mixed-signal  systems,  the  size  of 
the  passive  structures  is  becoming  comparable  to  the  signal  wavelength  at  the  operating 
frequency,  leading  to  distributed  effects  like  delay  playing  an  important  role  in  time 
domain  analysis.  These  distributed  effects  imply  that  there  are  many  causality  conditions 
that  need  to  be  satisfied  by  a  passive  macro-model,  to  generate  the  correct  signal  response 
in  the  time  domain.  These  causality  conditions  depend  on  the  delay  in  the  passive 
network,  which  is  the  time  taken  by  the  signal  to  travel  from  one  port  to  another.  Hence 
determining  this  delay  from  the  frequency  domain  response  is  an  important  step  towards 
causal  macro-modeling. 

Using  the  minimum-phase  property  of  the  response  functions  of  passive  networks, 
a  transfer  impedance  response  can  be  separated  into  a  minimum  phase  function 

and  an  all-pass  function  as  follows  [4]: 

\Zl2'{juj)\  =  \Zl2Uu)\  (20) 
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aig[Zl2'0'w)]  =  -ip  j  \o^\Z\2{j9)\ cot  (21) 

Td_  (23) 

U) 

where  Z12’(ju)  is  the  minimum  phase  portion  of  Z12(jo))  and  ZHapO^)  is  the  all-pass 
portion  of  Z12(ju).  The  negative  gradient  of  the  phase  response  of  Z12ap0'u)  gives  the 
delay  embedded  in  the  response  Z12Qui).  This  methodology  can  be  used  to  extract  delay 
from  the  S,  Y  and  Z  parameter  representations  of  passive  networks. 

Mixed-mode  passive  structures  like  differential  transmission  lines  are 
characterized  by  two  types  of  delays:  differential  mode  (odd  mode)  delay  and  common 
mode  (even  mode)  delay.  The  technique  discussed  above  can  be  easily  extended  to 
extract  the  even  and  odd  mode  delays  in  mixed-mode  systems.  The  process  involves 
transforming  the  given  system  parameters  into  mixed-mode  parameters  and  applying  the 
above  technique  on  the  new  set  of  mixed  mode  parameters. 

3.2.2.2  Causality  Compensation  and  Results 

The  delay  extracted  from  the  frequency  response  can  be  used  for  enforcing 
causality  on  the  system  response.  To  compensate  a  frequency  response  for  causality,  the 
following  steps  are  carried  out 

1.  The  frequency  response  is  converted  into  the  time  domain. 

2.  The  delay  in  the  response  is  extracted  as  explained  in  the  previous  section. 

3.  The  extracted  delay  is  used  to  force  the  time  domain  response  to  zero  over  the 
delay  period. 

4.  The  signal  energy  in  the  response  is  adjusted  to  account  for  the  loss  due  to  step  3. 
This  ensures  correct  steady  state  response. 

This  has  been  demonstrated  through  the  simulation  of  a  long  transmission  system.  A 
microstrip  transmission  line  50cm  in  length  and  with  a  characteristic  impedance  of  5012 
was  constructed  in  Agilent’s  Advanced  Design  System  (ADS).  The  line  was  terminated 
with  a  5012  load  at  the  far  end  and  was  excited  at  the  near  end  with  a  250mV/100ps  step 
source  having  5012  input  impedance.  Fig.  3.9  shows  the  response  obtained  at  the  far  end 
with  and  without  causality  compensation  enforced  on  the  transmission  line  macro-model. 
The  response  without  causality  compensation  shows  a  violation  of  causality. 
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Fig.  3.9  Causality  compensation 

Next  the  transmission  line  was  excited  by  a  random  bit-pattern  and  an  eye-diagram  was 
obtained  for  the  far  end  response.  Fig.s  3.10(a)  and  (b)  show  the  eye-diagrams  obtained 
with  and  without  causality  compensated  frequency  domain  data. 


Eye  Diagram 


Eye  Diagram 
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Fig.  3.10  Eye-diagrams  using  uncompensated  (a)  and  causality  compensated  (b)  macro-models 


3.2.3  Additional  Features:  Nonlinear  Macro-modeling  Of  I/O  Drivers  And 
Receivers 

Signal  integrity  (SI)  and  timing  analysis  of  large  digital  systems  in  today’s  world  is 
becoming  more  and  more  complex  both  in  terms  of  CPU  memory  required  and  time 
consumed  for  simulation.  One  method  to  reduce  this  complexity  is  to  use  macro-models 
of  digital  driver  and  receiver  circuits  constituting  these  large  digital  systems.  Accurate 
modeling  of  the  digital  drivers  to  capture  their  non-linearity  is  a  big  challenge.  Black-box 
modeling  techniques  are  helpful  in  modeling  digital  driver  circuits  since  they  are 
independent  of  the  knowledge  of  the  internal  logic  of  the  driver  [4].  Fig.  3.11  shows  a  2- 
port  black-box  circuit  model  of  a  digital  buffer  with  information  about  its  input  and 
output  voltage-current  characteristics  and  no  information  about  its  internal  circuitry. 
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Fig.  3.11.  Black-Box  representation  of  a  driver  circuit 

In  this  report,  a  modeling  approach  using  spline  functions  with  finite  time 
difference  approximation  is  used  for  modeling  moderately  non-linear  digital  driver 
circuits.  To  estimate  power  supply  noise  and  cross-talk  accurately,  the  method  has  been 
extended  to  multiple  ports.  However,  for  highly  non-linear  circuits,  this  method  has 
problems  with  accuracy.  To  alleviate  this  problem,  a  method  based  on  Recurrent  Neural 
Networks  (RNN)  has  been  developed.  Also  in  this  report,  a  methodology  based  on  RNN 
and  static  characteristics  with  finite  time  delay  element  is  used  to  macro-model  receiver 
circuits.  The  accuracy  and  speed-up  of  the  macro-models  has  been  compared  against  the 
transistor  level  circuit  models  on  some  test  cases.  The  non-linear  macro-models  is  an 
extension  of  the  linear  macro-models  described  in  the  previous  section  and  will  be 
integrated  in  BEMP  v4.0. 

3.2.3.1  Spline  Function  With  Finite  Time  DifYerence  Approximation 
Modeiing  Technique 

The  output  current  of  any  driver  can  be  expressed  as  a  function  of  the  output  voltage 
using  static  characteristics.  Let  sub-models  fis  and  f2s  represent  the  static  characteristic 
relation  between  the  driver  output  current  and  the  driver  output  voltage  for  a  driver  when 
the  driver  input  is  set  HIGH  and  LOW  respectively,  as  shown  in  equation  (24) 

fsn  (^)  =  (*)  +  '  (^)  +  -•  +  ^«0  >  ^=1.2;  Ttl  >1  (24) 

where.  As  are  constants,  Vg  is  the  driver  output  voltage  and  the  value  of  m  is  usually  less 
than  5.  The  static  sub-models  fis  and  f2s  can  be  estimated  by  conducting  a  dc  sweep  at  the 
output  of  the  driver  for  both  driver  inputs  HIGH  and  LOW  respectively.  Static 
characteristic  models  do  not  however  capture  the  nonlinearity  associated  with  the 
dynamic  response  of  the  driver.  Since  static  characteristic  relations  fail  in  eapturing  the 
dynamic  characteristics,  static  sub-models  in  (24)  can  be  modified  to  include  the  previous 
time  instances  of  the  driver  output  current  so  that  dynamic  behavior  of  the  driver  can  be 
captured. 

When  the  driver  input  is  set  HIGH,  the  current  at  the  output,  igh,  can  be  expressed  as 
sub-model  fis  as  shown  in  equation  (25) 

fu  (^)  =  ^oh  W  =  (^)'"  +  + . +  4o  (25) 

Sub-model  fu  at  time  instance  ‘k-1  ’  can  be  expressed  as  equation  (26) 
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(26) 


fu  (^  - 1)  =  ioh  “  1)  =  ^\m^o  -  o'"  +  ^i(„-i)Vo  i.k  - 1)'”"'  + . +  A, 

Incremental  change  in  the  driver  output  current  Moh  is  the  difference  between  the  present 
instance  {k)  and  previous  time  instance  {k-1)  values  of  sub-model  fis  as  shown  in 
equations  (27)  and  (28) 


fu  ik)  1)  =  ioh  (^)  -  ioh  -  0  = 


(27) 


(Or)  4(0-/„(?-A0  = 

Once  Moh  is  calculated,  first  derivative  of  driver  output  current  ioh'  can 
approximated  using  equation  (29)  as 

fu  (0-/b  ^ 

A.  A.  ^  oh 


(29) 


where,  At  is  the  sampling  time.  The  effect  of  dynamic  behavior  is  captured  in  ioh', 
similarly  the  effect  of  dynamic  behavior  for  f2  is  captured  by  A/’o/'.  Therefore,  static  sub¬ 
models  fis  and  f2s  can  be  modified  as 

finik)=fu(k)  +  P*i'oh 
f2Ak)  =  f2s(k)  +  PP*i'ol 

where  p  and  pp  are  constants  whose  magnitude  can  be  estimated  by  calculating  the  error 
between  fis/fis  and  transistor  level  driver  output  current  values  for  inputs  HIGH/LOW 
respectively.  It  is  important  to  note  that  there  is  no  limitation  on  the  number  of  previous 
output  current  time  instances  in  the  static  sub-models  fis/f2  [5]. 

Once  fin  and  are  estimated  for  input  HIGH  and  LOW  respectively,  the  relation 
between  driver  output  current  and  voltage  can  be  expressed  as  shown  in  equation  (31) 


G  (^)  =  Wl  (^)/l  A)  +  ^2  (*)/2  W  (31) 

where,  wj  and  W2  are  the  weighting  functions  that  help  fj  and  transition  from  one  state 
to  another.  Since,  two  unknowns  exist,  weighting  functions  wi  and  W2  can  be  found  by 
linear  inversion  of  (31)  for  two  different  loads. 


3.2.3.2  Extension  To  Multiple  Ports 

To  incorporate  the  effect  of  the  power  supply  node  vm,  a  new  relation  should  be 
drawn  between  driver  power  supply  current  (/</</)  and  driver  power  supply  voltage  [6]. 
However,  the  driver  power  supply  current  is  not  only  a  function  of  driver  power  supply 
voltage  but  also  a  function  of  driver  output  voltage  (vo),  as  shown  in  equation  (32) 

idd  (k)  =  WiM  {k)fdd  (Vo  {k),  (*))  +  ^2dd  (k)f2dd  (k),  (^))  +  (32) 

where,  fidd  and  f2dd  are  the  power  supply  sub-models  that  relate  the  power  supply  current 
to  power  supply  voltage  for  input  HIGH  and  LOW  respectively.  In  (32),  weighting 
functions  widd,  W2dd,  and  w^dd  help  sub-models  fdd  and  f2dd  in  transitioning  from  one  state 
to  another.  Sub-models  fdd  and  fdd  can  be  expressed  as  a  combination  of  static  sub¬ 
models  {fisd  and  fsd)  and  dynamic  sub-models  {fidy  and  fdy),  as  shown  in  equation  (33) 

fndd  Ao  ik))  =  f„sd  Ao  {k),  Vm  (k))  +  f„dy  (Vo  (k),  Vrfrf  (k)) ;  where,  n  =  1,2  (33) 
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The  static  sub-models  fud  and  f2sd  are  calculated  through  a  double  dc  sweep  at  driver 
output  and  at  driver  power  supply.  This  relationship  is  shown  in  equation  (34)  as: 

f„sd  (Vo  (^))  =  KrVo''  (A:)  + ...  +  A„„  (34) 

where,  n  =  l,2;r  >=l;s  >=1; 

The  dynamic  sub-models  fidy  and  f2dy  are  expressed  in  equation  (35)  as: 

f ndyiyoi^y)  ~  Pdn^^on  '^^dn^^ddn  '^PPdn^on  "^^^dn^^ddn  ^  ~  h2  (35) 

In  equation  (35),  dynamic  characteristic  sub-model  (fddx)  constants  pdn,  qdn,  ppdn  and 
qqd„  are  estimated  by  connecting  PWL  voltage  sources  both  at  driver  output  and  at 
driver  power  supply  and  measuring  the  error  between  static  power  supply  current  and 
transistor  level  driver  power  supply  current.  In  equations  (34)  and  (35),  A„,  pdn,  qdn,  PPdn 
and  qqdn  are  all  constants  and  depend  on  the  driver  being  modeled.  Terminating  the 
driver  with  2  different  loads  result  in  two  equations  of  (31)  and  since,  three  unknown 
weighting  functions  exist,  one  method  for  solving  the  problem  is  by  assuming  =  (1  - 
W2d)-  Thus,  weighting  functions  wid,  W2d  and  wm  can  be  calculated  once  fid  and  f2d  are 
estimated  for  2  different  loads.  Similarly,  driver  output  current  is  not  only  a  function  of 
driver  output  voltage  but  also  a  function  of  driver  supply  voltage.  The  procedure  for 
estimating  driver  output  current  is  similar  to  estimating  driver  power  supply  current. 

Test  case  1:  A  test  case  is  generated  where  an  IBM  driver  (HSTL_B)  was  connected  to  a 
plane  pair  which  was  modeled  using  the  cavity  resonator  method.  The  plane  pair  was  10 
cm  X  6  cm  in  length  and  width.  It  had  5  ports  on  each  plane,  Vdd  and  Gnd  as  shown  in 
Fig.  3.12.  Ten  drivers  were  connected  at  port  1  and  all  the  drivers  were  connected  to  25 
ohm  transmission  lines  which  in  turn  were  terminated  using  1  pF  capacitance.  All 
transmission  lines  were  terminated  at  port  3.  The  power  supply  {vdd)  was  at  port  4.  Two 
additional  ports  were  used  for  probing.  All  10  drivers  were  identical.  The  resulting 
simultaneous  switching  noise  (SSN)  was  calculated  using  both  actual  transistor  level 
driver  model  and  spline  function  with  finite  time  difference  approximation  model.  Fig. 
3.13(a)  shows  the  near  end  and  far  end  voltage  waveforms  on  transmission  line  1.  It  can 
be  clearly  seen  that  the  modeled  and  the  transistor  level  model  waveforms  match 
accurately.  A  third  order  spline  function  with  two  previous  time  instances  was  used  to 
model  sub-models  //  and  fdd-  Fig.  3.13(b)  shows  the  SSN  at  ports  1,  3  and  5  respectively 
both  for  modeled  and  actual  simulated  cases  and  again  it  can  be  clearly  seen  that  the 
noise  is  estimated  accurately  using  the  spline  function  with  finite  time  difference 
approximation  method.  Spline  function  model  is  8  times  faster  than  the  IBM  transistor 
level  driver  model. 
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Fig.  3.12.  Plane  pair  model  generated  using  cavity  resonator  method.  Both  planes  have  5  ports  each. 


Fig.  3.13(a).  Near  end  (V_ne(t))  and  far  end  (V_fe(t))  voltage  waveforms  on  transmission  line  #  1  for  IBM 
transistor  level  driver  model  (straight  line)  and  spline  function  with  finite  time  difference  approximation 

model  (dotted  line). 
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Fig.  3.13(b)  Simultaneous  Switching  Noise  (SSN)  at  ports  pi,  p2  and  p5  when  10  identical  drivers  are 
switching  together.  SSN  from  actual  transistor  level  driver  model  (straight  line)  and  spline  function  with 
finite  time  difference  approximation  model  (dotted  line). 
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3.2.3.3 


Recurrent  Neural  Network  Modeling  Technique 


Spline  function  with  finite  time  difference  approximation  has  some  limitations. 
When  the  transistor  level  driver  circuit  models  are  highly  complex,  the  proposed  method 
cannot  capture  the  high  non-linearity  present  in  the  driver.  One  solution  for  modeling 
these  highly  non-linear  driver  circuits  is  by  using  Recurrent  Neural  Networks  (RNN). 
Sub-functions  fi  and  f2  are  now  expressed  using  RNN  functions  as  shown  in  equations 
(36)  and  (37) 


M  N 

fn  =  « ji^i  +  ^oj )  +  (36) 

g{u)  =  {e^  +  e~^  )  l{e^  -  e~^ )  (37) 

where,  b  and  a  are  weights  associated  with  the  neural  network,  N  represents  number  of 
hidden  neurons,  M  represents  number  of  outputs  and  x  is  the  regressor  vector  that  takes 
into  account  both  the  past  and  present  samples  of  driver  output  current,  output  voltage, 
power  supply  current  and  power  supply  voltage  as  shown  in  (38). 


Vo{k),v„{k  -  \\...,v„(k  - r) 
idd  (k  -  i),idd  (k  -  2),...,/rfrf  (k  -  r) 
.  VM{k),Vjj{k  -\),...,vjd{k  -r) 


(38) 


A  RNN  is  a  special  type  of  neural  network  having  the  capability  of  learning  and 
then  representing  dynamic  system  behavior.  RNN  has  been  used  in  areas  such  as  control, 
signal  processing  and  system  identification.  A  training  method  based  on  back  propagation 
through  time  (BPTT)  is  used  for  training  the  RNN. 

Test  Case  2:  A  test  case  with  four  IBM  drivers  (‘DDR2’)  connected  to  a  small  plane  pair 
modeled  using  cavity  resonator  method  was  generated.  IBM  ‘DDR2’  driver  has  a  power 
supply  voltage  of  2.5  volts  and  it  is  usually  driven  at  250  MHz  with  a  rise  time  of  1 .25  ns. 
The  pane  pair  has  a  dimension  of  6  cm  X  4  cm  with  4  ports  on  Vdd  plane  and  4  ports  on 
Gnd  plane.  All  the  drivers  were  identical,  driving  a  50  ohm  transmission  line  and  were 
connected  at  port  1.  The  power  supply  node  was  at  port  3.  All  the  transmission  lines  were 
terminated  at  port  2  using  a  2  pF  capacitance  and  port  4  was  used  for  probing.  Fig. 
3.14(a)  shows  the  plane  pair  used  to  model  the  power  supply  noise  when  4  drivers  are 
simultaneously  switching.  The  IBM  driver  was  modeled  using  RNN.  The  regressor 
vector  X  consists  of  present  samples  of  power  supply  voltage  and  driver  output  voltage 
and  past  samples  of  driver  output  current,  power  supply  current,  power  supply  voltage 
and  output  voltage.  The  RNN  model  for  sub-functions//, 2  and  sub-functions require 
one  hidden  layer  with  2  hidden  neurons.  BPTT  training  algorithm  was  used  to  estimate 
the  weights  of  the  RNN  model.  A  third  order  spline  function  macro-model  with  two  time 
instances  back  was  generated.  Power  supply  noise  at  ports  1,  2  and  4  was  measured  for 
RNN  model,  spline  function  with  finite  time  difference  approximation  model  and 
transistor  level  driver  model.  It  can  be  seen  from  Fig.  3.14(b)  that  RNN  model  captures 
SSN  accurately  when  all  4  drivers  are  switching  simultaneously,  whereas,  spline  function 
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with  finite  time  difference  model  fails  to  accurately  model  SSN.  RNN  model  is  6-7  times 
faster  and  spline  model  is  13-15  times  faster  than  transistor  level  driver  model,  when  the 
time  domain  simulations  are  carried  out  for  3  cycles.  The  computational  time  speed  up 
between  the  RNN  model  and  actual  transistor  level  model  increases  with  the  increase  in 
complexity  of  the  circuit. 


Fig.  3.14(a)  Plane  pair  model  generated  using  cavity  resonator  method.  Both  planes  have  4  ports  each. 
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Fig.  3.14(b)  Simultaneous  Switching  Noise  (SSN)  at  ports  pi,  p2  and  p4  when  10  identical  drivers  are 
switching  together.  SSN  from  actual  transistor  level  driver  model  (straight  line)  RNN  model  (dotted  line) 
and  spline  function  with  finite  time  difference  approximation  model  (dashed  line). 


3.2.3.4  Receiver  Modeling  Methodology 


In  order  to  model  the  input  characteristics  of  the  receiver  accurately,  the  input 
current,  /,«,  should  be  accurately  modeled.  The  input  current  can  be  expressed  as  equation 
(39) 

(39) 


x{k)  = 


hn  -  1).  Un  ik  -  2),...,  ii„  (k  -  /?),! 
Vi„  (k),  V,„  (k  - 1),...,  V,.„  (k-p)  j 


(40) 
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where,  x(k)  takes  into  account  all  the  previous  time  instances  of  receiver  input  voltage, 
Vin,  and  input  current,  Function /„  is  a  non-linear  RNN  function  as  shown  in  equation 
(41) 


M  N 

f.  (jc)  =  Z  hjg{  z  a jiXi  +  a  • )  + 

'«  k=\  7=1 

(41) 

g(x)  =  (e'‘  +e~^) 

(42) 

where,  a  and  b  are  weights  associated  with  the  recurrent  neural  network,  N  represents 
number  of  hidden  neurons,  M  represents  number  of  outputs.  The  weights  are  estimated 
using  Back  Propagation  Through  Time  (BPTT)  algorithm.  To  estimate  fi„  accurately  a 
good  training  data  is  required  that  should  excite  both  the  linear  and  the  non-linear  regions 
of  the  receiver  input  current  [7]. 

3.2.3.5  Receiver  Output  Characteristics 

The  output  voltage  of  the  receiver  is  modeled  using  a  combination  of  static  receiver 
characteristics  (v,„  -  Vout)  and  a  delay  element  to  capture  the  input-output  delay  of  the 
receiver.  The  input-output  delay  of  a  receiver  is  dependent  on  the  slew  rate  of  the  input 
voltage  and  independent  of  the  frequency  of  excitation  for  normal  range  of  operations. 
The  delay  for  the  particular  slew  rate  can  be  easily  estimated  and  the  relation  between 
input  and  output  voltages  can  be  captured  using  the  static  relation.  Thus,  the  combination 
of  the  two  would  result  in  accurate  modeling  of  the  receiver  output  voltage. 

Test  Case: 

Test  Case  J:  A  realistic  test  case  was  setup  where  an  IBM  DDR2  driver  was  connected  to 
a  75  ohm  ideal  transmission  line  with  a  0.2ns  time  delay.  The  transmission  line  was  in 
turn  connected  to  a  DDR2  receiver.  A  RNN  model  was  used  to  model  the  DDR2  driver 
and  DDR2  receiver  was  modeled  using  the  proposed  technique.  The  driver  was  excited 
with  a  2.5V  pulse  with  0.25ns  rise  time.  The  voltages  at  the  near  end  of  the  transmission 
line,  the  input  of  the  receiver  and  the  output  of  the  receiver  were  plotted  as  shown  in  Fig. 
3.15.  It  can  be  clearly  seen  that  the  modeled  results  match  well  with  the  IBM  DDR2 
transistor  level  receiver  model.  The  transistor  level  DDR2  setup  takes  290  seconds  for 
computation  and  the  macro-model  takes  less  than  4  seconds  for  simulation.  The  macro¬ 
model  is  very  accurate  and  gives  a  timing  error  of  less  than  20  ps 
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Fig.  3.15  Driver  output  for  IBM  DDR2  (straight  line)  and  for  macro-model  (dotted  line);  Receiver  input  for 
IBM  DDR2  (straight  line)  and  for  macro-model  (dotted  line);  receiver  output  for  IBM  DDR2  (straight  line) 

and  for  macro-model  (dotted  line). 


3.2.3.6  Non-linear  macro-modeling  comparison 


Test  Case 
Description 

Simulation  Time  Consumed 

Speed-up 

Transistor  Level 
Circuit  Model 

Macro-Modeling 

Technique 

IBM  AGP  Driver 
connected  to  a  50 
ohm  transmission 
line 

771  s 

Spline  function 
model  (7  s) 

IlOX 

16  IBM  HSTL_A 
driver  connected  to 
a  plane  pair  for  SSN 
estimation 

3096  s 

Spline  function 
macro-model 
(150  s) 

20X 

10  IBM  HSTL_A 
driver  connected  to 
a  plane  pair  for  SSN 
estimation 

550  s 

Spline  function 
macro-model 
(70s) 

8X 

4  IBM  DDR2 
receiver  connected 
to  a  plane  pair  for 
SSN  estimation 

560  s 

RNN  model 
(80  s) 

IX 

IBM  DDR2  receiver 
eircuit  eonnected  to 
a  50  ohm 

96  s 

Statis  model  with 
finite  time  delay 
element 

25X 
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transmission  line 

(4  s) 

IBM  DDR2  driver 
and  receiver  circuits 
connected  to  a  50 
ohm  transmission 
line 

290  s 

RNN  macro-model 
for  driver  and  static 
model  with  finite 
time  delay  element 
(4  s) 

70X 

3.3  Modeling  Results  and  Benchmarking 

a)  A  32  bit  bus: 

BEMP  was  used  in  macro-modeling  of  a  32-bit  wide  interconnect  bus  on  a  Delta- 
Sigma  modulator  chip  designed  by  HRL.  Fig.  3.9(a)  shows  the  interconnects  (32-bit  bus) 
on  a  Delta-Sigma  modulator  chip  deigned  by  HRL.  The  32-bit  bus  shown  in  Fig.  3.9(b) 
was  modeled  using  an  EM  simulator  that  is  proprietary  to  HRL.  Therefore,  only  the 
frequency  response  of  the  bus  was  provided.  Using  this  information,  a  macromodel  of  the 
32-bit  bus  was  developed.  The  comparison  of  the  magnitude  in  logarithm  scale  between 
the  given  frequency  response  of  the  bus  and  the  frequency  response  of  the  macromodel 
using  BEMP  version  3.0  is  shown  in  Fig.  3.10.  This  result  in  Fig.  3.10  is  for  Y(l,19), 
which  was  screen-captured  when  BEMP  version  3.0  was  run  on  Windows.  The  deviation 
(RMS  error)  in  Fig.  3.10  shows  the  accuracy  of  the  macromodel.  For  this  test  case,  it  is 
important  to  note  that  only  diagonal  elements  were  used  for  the  approximation  of  32-port 
admittance  parameters.  Hence,  as  compared  to  an  original  computation  time  of  C  =  0{K  x 
7V^)  =  0{K  X  1.-0737  x  10^),  where  N=32  and  W  is  a  constant,  a  computation  time  of  C  = 
0(K  X  A^^)  =  0(K  X  32768)  was  achieved,  which  resulted  in  a  speed  up  of  32767. 

The  32-port  macromodel  was  donnected  to  a  3.3V  dc  source  having  lOOps  rise 
and  200ps  fall  times  at  port  1,  as  shown  in  Fig.  3.11.  Fig.  3.12  show  the  time-domain 
waveform  when  all  the  ports  were  terminated  with  30-  resistors,  which  was  used  to 
simulate  the  crosstalk  between  ports  1  and  9. 
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Fig.  3.10:  Comparison  of  admittance  parameter  Y(l,9) 
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300ps 


3.3V 

Voltage  0 


\  A 

loops  200ps 


Fig.  3.11  Circuit  being  simulated 


30  ohm  temiination  at  all  ports 


Fig.  3.12:  Crosstalk  between  ports  1  and  9 


b)  A  4-port  power  plane  pair  with  de-coupling  capacitors: 

A  4.0in  X  6.0in  power  plane  pair  with  a  dielectric  thickness  of  62mil  is  shown  in 
Fig.  3.13.  Six  decoupling  capacitors  represented  as  RLC  circuit  elements  were  connected 
to  the  power  plane.  A  total  of  1000  uniformly  spaced  samples  were  generated  from  1.0 
MHz  to  4.0  GHz  using  the  cavity  resonator  model.  Although  the  frequency  response  of 
the  power  plane  pair  with  decoupling  capacitors  is  highly  irregular,  the  entire  frequency 
domain  for  admittance  parameters  was  divided  into  20  uniform  subbands  having  50 
samples  per  subband  without  overlapping  subbands.  The  number  of  local  and  global 
iterations  for  the  pole  replacement  method  was  1  and  200,  respectively.  Using  the  orders 
NS  of  4  and  DS  of  5  within  each  subband,  the  number  of  complex  conjugate  poles  and 
real  poles  extracted  was  78  and  1,  respectively.  The  comparison  between  original  data 
and  the  frequency  response  of  the  constructed  macromodel  for  admittance  parameters 
over  a  bandwidth  of  4.0  GHz  is  shown  in  Fig.s  3.14  and  3.15. 


49 


Dielectric  Properties:  Er=S.05  Loss  tangent=0.018  Coii(luctivitj'=5.7e+7 
Decoupling  Capacitors: 

Fig.  3.13:  Power  plane  pair  with  decoupling  capacitors 


Fig.  3.14:  Comparison  of  real  part  of  admittance  parameters  for  the 
power  plane  pair 


’  'GHz'  “  ‘  ’  ‘GHz'  ‘  ‘  '  ‘GHz'  '  '  =  ‘  GHz' 

Fig.  3.15  Comparison  of  imaginary  part  of  admittance  parameters  for 
the  power  plane  pair 

b)  4-port  transmission  line  data  from  measurement: 

A  total  of  799  uniformly  distributed  frequency  samples  from  50.0  MHz  to  20.0 
GHz  for  a  4-port  transmission  line  was  measured  using  a  vector  network  analyzer.  The 
entire  frequency  band  was  divided  into  16  subbands  having  a  bandwidth  of  1.25  GHz. 
The  macromodel  was  constructed  from  the  lower  subband  using  the  orders  NS  of  8  and 
DS  of  9  in  all  subbands.  The  number  of  local  and  global  iterations  for  the  pole 
replacement  method  was  2  and  200,  respectively.  The  number  of  complex  conjugate 
poles  and  real  poles  extracted  were  224  and  1,  respectively.  Fig.s  3.16  and  3.17  show  the 
comparison  between  original  frequency  data  and  the  response  of  the  constructed 
macromodel  for  real  (Y1 1)  and  imaginary  (Y12)  admittance  parameters,  respectively. 

The  macromodel  was  connected  to  a  400mV  source  having  5  Ops  rise/fall  times 
and  a  500  termination  at  port  1.  Resistors  with  a  500  value  were  terminated  at  ports  2,  3, 
and  4.  The  time-domain  waveform  at  ports  1  and  2  are  shown  from  0  to  30ns  in  Fig.  3.18, 
demonstrating  stability  and  passivity. 
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Imaginary  Y12  RealYll 


3.4  Summary 

BEMP  is  a  software  tool  for  generating  macro-models  of  arbitrary  multi-port  passive 
networks,  which  can  be  used  in  a  circuit  simulator  like  SPICE.  These  macro-models 
accurately  capture  the  electromagnetic  effects  of  the  passive  networks  which  are 
important  in  the  time-domain  simulation  of  digital  and  mixed-signal  systems.  The 
software  is  coded  using  the  C++  programming  language  and  is  executable  on  the 
Windows  operating  system.  The  program  accepts  the  S,  Y  or  Z  parameters  of  passive 
networks  in  the  Touchstone  format.  The  output  is  a  lumped  element  sub-circuit  file  which 
can  be  easily  integrated  into  a  SPICE  simulation.  BEMP  also  generates  a  macro-model  of 
the  input  structure  represented  by  rational  functions  that  can  be  used  in  frequency  domain 
analysis  and  simulation.  The  current  version  BEMP  v3.0  incorporates  macro-modeling  of 
broadband  multi-port  frequency  response  data,  guaranteeing  stability  and  passivity  in  the 
generated  macro-models.  The  program  has  been  tested  for  a  variety  of  systems  ranging 
from  coupled  transmission  lines  to  irregular  shaped  power/ground  planes  to  a  high-speed 
data  bus.  The  simulations  using  the  generated  macro-models  have  demonstrated  good 
model  accuracy.  The  program  is  being  further  developed  to  enforce  causality  on  the 
generated  macro-models  and  include  the  macro-modeling  of  nonlinear  devices  like 
drivers  and  receivers.  These  updates  will  be  incorporated  in  BEMP  v4.0. 

References; 

1.  S.  Min  and  M.  Swaminathan,  ’’Efficient  Construction  of  Passive  Macro-models 
for  Resonant  Networks”,  Proceedings  of  the  10th  Topical  Meeting  on  Electrical 
Performance  of  Electronic  Packaging,  Boston,  Massachusetts,  Oct.  2001. 

2.  S.  Min  and  M.  Swaminathan,  "Construction  of  broadband  passive  macro-models 
from  frequency  data  for  distributed  interconnect  networks",  accepted  for 
publication  in  the  IEEE  Transactions  on  Electromagnetic  Compatibility. 

3.  R.  Mandrekar  and  M.  Swaminathan,  “Delay  Extraction  from  Frequency  Domain 
Data  for  Causal  Macro-modeling  of  Passive  Networks”,  Submitted  to  the 
International  Symposium  on  Circuits  and  Systems  2005. 

4.  K.  Judd  and  A.  Mees,  “On  Selecting  Models  for  Nonlinear  Time  Series,”  Physica 
D,  Vol.  82,  pp.  426-444,  1995. 

5.  Bhyrav  Mutnury,  Madhavan  Swaminathan  and  James  Libous,  “Modeling  of 
Power  Supply  Noise  using  Efficient  Macro-Model  of  Non-Linear  Driver,”  2004 
IEEE  International  Symposium  on  EMC.  San  Jose  (CA),  USA,  August  9-13, 
2004. 

6.  Bhyrav  Mutnury,  Madhavan  Swaminathan  and  Jim  Libous,  “Macro-Modeling  of 
Non-Linear  Digital  EO  Drivers,”  accepted  for  IEEE  Transactions  on  Advanced 
Packaging,  2004 

7.  Bhyrav  Mutnury,  Madhavan  Swaminathan,  Moises  Cases,  Nam  Pham,  Daniel  N. 
de  Araujo  and  Erdem  Matoglu,  "Macro-Modeling  of  Transistor  Level  Reeeiver 
Circuits,"  accepted  at  13‘^  Tropical  Meeting  on  Electrical  Performance  of 
Electronic  Packaging  (EPEP  2004),  Portland,  Oregon,  October,  2004 


53 


CHAPTER  4  PHYSICS  BASED  REDUCED  ORDER  MODELING 

METHODS 


4.1  Mathematical  Models 

The  primary  test  device  for  the  NeoCAD  algorithms  developed  in  this  program  is 
the  fourth-order  continuous  time  delta-sigma  modulator  (ASM)  [4-1],  for  which  a  block 
diagram  is  shown  in  Fig.  4-1.  The  RF  signal  enters  the  modulator  on  the  left  at  the  input 
gain  cell  go  and  the  output  stream  of +rs  and  -I’s  is  obtained  following  the  third  latch  at 
the  right.  The  principle  components  of  the  modulator  are  transconductance  amplifiers 
(go-s))  transimpedance  integrators  (l/sCj),  digital-to- analog  converters  (DACs)  (g6-9),  and 
a  voltage  amplifier  Qn,,  3  latches  and  an  external  clock  that  drives  the  latches.  We  have 
developed  behavioral  models  for  each  of  these  components  starting  from  simplified 
transistor-level  models.  We  have  also  developed  full  circuit  level  models  of  these 
components  as  discussed  in  previous  sections.  The  mathematical  analysis  starts  from  a 
simple  linear  model  of  the  AZM  with  the  quantizer  approximated  by  the  sampled  “sign” 
function  as  discussed  in  section  4.1.2.  The  output  of  the  modulator,  which  is  a  stream  of 
+1  and  -I’s,  is  multiplied  by  a  window  function  (the  Kaiser  window  is  used  for  all  our 
calculations),  and  the  discrete  Fourier  transform  of  these  numbers  gives  the  modulator 
spectrum.  The  signal-to-noise  ratio  can  be  obtained  directly  from  the  spectrum. 


Fig.  4-1.  Schematic  block  diagram  of  the  4*  order  continuous  time  delta  sigma  modulator. 


4.1.1  Idealized  Delta-Sigma  Modulator  Model 

The  linear  model  of  the  4*’’  order  AZM  is  given  by  the  following  set  of  equations: 
dxi(t) 


Cl 

Cl 

Cs 


dt 

dxi{t) 

dt 

dt 


=  go  U(t)  -  gi  Xiit)  -  g6  yit  -  At) 

=  gi  xi(t)  -  gj  y(t  -  At) 

=  ^3  xiit)  -  g5  X4(t)  -  gs  yit  -  At) 


4-1 

4-2 

4-3 
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dx^it) 

Q  — - —  =  g4  Xsit)  -  g9  yit  -  At) 
ot 

yit)  =  sgn[:v4  (4)  +  Qfb  y(t  -  At)]  for  ^  t  <  4+1 


4-4 

4-5 


where  x\  to  x^  are  the  state  variables  following  each  of  the  4  integrators,  Ci  to  C4  are  the 
capacitances  of  the  integrator,  to  are  the  coefficients  of  the  gain  cells,  y  is  the  output 

of  the  AEM  and  Qfb  is  the  quantizer  feedback  coefficient.  The  subscript  n  in  the  equation 
for  j(r)  represents  the  n'*’  clock  period  and  At  represents  the  effective  delay  of  the 
feedback  DAC.  Note  that  eqs.  4-1  to  4-5  are  linear  between  clock  periods  (t„<t<  t„+j). 

We  have  developed  a  set  of  parameters  that  lead  to  good  DSM  performance  as 
given  in  Table  1 .  The  input  signal  u(t)  is  given  by 

u(t)  =  «o  sin(27t  vt).  4-6 

where  «o  is  the  amplitude  and  v  is  the  frequency. 


^0=  1.660  10'' 

C,=  l.pF 

gf  =  6.383  10'' 

C2=  l.pF 

g2=  6.383  10'' 

C3=  l.pF 

^3=  2.500  10'' 

C4=  l.pF 

g4  =  6.383  10-' 

0fb  =  0.1646 

gs  =  6.383  10'' 

Tc  =  0.25  ns 

^6  = -4.833  10''’ 

A/ =  0.125  ns 

g7  = -1.933  10'' 

V  =980  MHz 

;?8  = -3.332  10'' 

Mo  =  0.5 

g9=  8.255  10'' 

Table  4-1.  Simulation  parameters  for  4*  order  continuous  time  delta-sigma  modulator 


Results  obtained  by  solving  eqs.  4-1  to  4-5  for  input  given  by  eq.  (4-6)  and  parameters 
given  in  Table  4-1  are  shown  in  Fig.  4-1  for  a  16384-clock  period  calculation.  Fig.  4-1 
also  shows  independent  Simulink  calculations  for  the  same  input  and  parameters.  Note 
the  good  agreement  between  the  two  results  including  the  deep  “notch”  at  ^i/(27t  Ci)  = 
1.016  GHz.  The  signal  peak  at  980  MHz  is  identical  in  each  case.  The  details  of  the 
noise  peaks  vary  slightly  from  one  calculation  to  another,  but  the  envelopes  are  in  good 
agreement. 
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(b) 


Fig.  4-2.  Discrete  Fourier  Transform  (dB)  of  quantizer  output  multiplied  by  Kaiser  Window  as  a  function 
of  frequency.  Simulink  calculations  are  in  blue;  Mathematica  calculations  are  in  red.  (a)  Full  range,  (b) 
Blow  up  of  the  “notch”  with  the  Kaiser  Window  shown  as  an  insert. 

The  signal  can  be  moved  through  notch  by  changing  the  input  frequency.  The  average 
noise  in  the  notch  should  remain  unchanged  as  shown  in  Fig.  4-3. 
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Fig.  4-3.  Calculations  of  ASM  spectra  for  signals  with  variable  input  frequency  showing  that  the  average 
ASM  noise  level  is  independent  of  the  signal  frequency. 

The  basic  gain  cell  model  is  shown  in  Fig.  4-4(a).  The  input  voltage  V  and  output 
current  I  are  related  by 

V=  IR  +  Ft  ln[(/ee+7)/(/ee-/)]  (4-7) 

where  Zee  is  the  value  of  the  current  source,  R  is  the  resistance  shown  in  Fig.  4-4a  and  Ft 
is  the  thermal  voltage  ( ~  0.025  V  at  room  temperature).  Fig.  4.4(b)  compares  the  linear 
approximation  to  the  exact  solution  of  eq.  4-7  for  the  I-V  curve. 


Fig.  4-4.  Gain  cell  model,  (a)  Circuit  schematic,  (b)  Nonlinear  (solid  line)  and  linear  approximation 

(dashed  line)  to  the  I-V  curve  of  the  gain  cell. 


Fig.  4-5  shows  the  effect  of  gain  cell  saturation  in  cell  gi  on  ASM  SNR  as  a 
function  of  lee  Note  that  to  keep  the  small  signal  gain  coefficient  at  6.383  10'^  as  given 
in  Table  4-1,  it  is  necessary  to  vary/?  and  lee  simultaneously;  as  can  be  seen  from 
eq.  (4-7),  varying  lee  alone  changes  the  small  signal  gain  coefficient.  The  SNR  plotted  in 
this  and  subsequent  Fig.s  is  defined  as  the  ratio  of  the  power  in  the  signal  divided  by  the 
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integral  of  the  noise  spectrum  over  the  specified  frequency  band  centered  on  the 
minimum  of  the  notch. 


2  MHz,  4  MHz,  8  MHz 


he,  mA 

Fig.  4-5.  SNR  as  a  function  of  /ee  in  gain  cell  1  for  the  baseline  4”’  order  ASM.  The  SNR  is  given  for  3 
values  of  the  noise  bandwidth  centered  about  the  minimum  of  the  notch. 

Gain  cell  saturation  leads  to  a  second  deleterious  effect,  intermodulation  products, 
when  the  input  signal  consists  of  two  or  more  tones.  To  investigate  this  effect,  we 
allowed  gain  cell  0  to  saturate  and  used  an  input  signal  having  the  form  of  eq.  (4-6)  plus  a 
second  tone  50  dB  down  in  power  and  2  MHz  lower  in  frequency  than  the  primary  signal. 
Visualization  of  spurs  in  ASM  spectra  is  critically  dependent  on  the  number  of  clock 
periods  in  the  simulation  as  shown  in  Fig.  4-6.  For  signals  separated  by  2  MHz,  spur 
resolution  requires  between  64000  and  260,000  clock-period  simulations. 


Frequency  (GHz) 


1.01  14)12  1.014  1.016  1.018  14)2 

Fr«quancy  (GHz) 


fnr'Ayy 

01  1.012  1.014  1.016  1.016  1.0 

Fr*qu»ney  (GHz) 

■1 

nnjijjl 

sB 

14)1  1.012  14)14  1.016  1.018  1.02 

Frtquqney  (GHz) 


Fig.  4-6.  Spur  resolution  as  a  function  of  number  clock  periods  in  the  simulation:  from  upper  left  to  lower 
right:  16384,  65536,  262144,  and  1048576  clock  periods. 
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4.1.2  Quantizer 


In  the  previous  sections  the  quantizer  was  modeled  by  the  sign  function  as 
described  in  eq.  (4-5).  In  this  section,  we  generalize  the  quantizer  behavioral  model  to  a 
simplified  circuit  model.  In  the  4*’’  order  AEM  shown  in  Fig.  4-1,  the  quantizer  consists  of 
three  latches  in  series.  The  operation  of  each  latch  is  controlled  by  the  output  from  the 
clock  located  elsewhere  in  the  integrated  circuit  and  described  in  more  detail  in  Section 
4.1.5.  Here  we  assume  that  the  clock  circuit  output  is  a  perfect  square-wave  voltage 
signal  with  period  TV  and  peak-to-peak  amplitude  Fdock-  Mathematically,  the  clock 
voltage  is  then  given  by 

Veit)  =  +Fciock/2  for  /„<=  t<t„+Tc/2 

-F'ciock/2  for  t„+Tcl2<=t<t„+Tc  =  tn+i  (4-8) 

Each  latch  can  be  modeled  as  two  RC  circuits  and  two  gain  cells  as  shown  in  Fig.  4-6. 
This  circuit  operates  in  two  modes  that  depend  on  the  sign  of  the  clock  signal  and  that  we 
will  refer  to  as  “tracking”  and  “latching”.  If  the  clock  is  positive,  the  circuit  operates  in 
the  tracking  mode;  if  negative,  it  latches.  Physically,  we  can  describe  the  tracking 
behavior  as  the  situation  in  which  the  output  voltage  of  the  circuit  depends  on  the  input 
voltage  over  the  entire  time  interval  (half  the  clock  period  in  this  case).  While  the  output 
in  the  tracking  mode  is  not  an  amplified  replica  of  the  input,  it  does  follow  the  input 
behavior  to  some  extent.  In  the  latching  mode  the  output  voltage  of  the  circuit  depends 
only  on  the  input  voltage  at  the  beginning  of  the  half  clock  period — hence  the  concept  of 
latching  onto  the  signal  at  a  given  instant. 


Fig.  4-7.  Schematic  diagram  of  the  latch  circuit. 

The  output  voltage  V  of  the  latch  circuit  shown  in  Fig.  4-7  can  be  described  by  the 
following  differential  equation: 
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dVif) 

dt 


1 

+  V(t)  =  —  I^R  tanh 
2 


Veit)  \ 

^^ock/2  / 


H —  /ee  ^  tanh 
2 


1  + 


Vc(t)  \ 

^^ock/2  / 


(4-9) 


where  Vo(t)  is  the  input  and  V{t)  is  the  output  voltage,  t  =  i?  C  is  the  RC  time  constant,  /ee 
is  the  current  produced  by  the  current  source,  and  Ft  is  the  thermal  voltage,  Boltzman’s 
constant  times  temperature,  k^T.  The  terms  on  the  left  hand  side  of  eq.  (4-9)  are  the 
standard  RC  time  constant  terms.  The  terms  on  the  right  hand  side  can  be  derived  from 
the  conventional  transistor  equation  with  the  resistor  at  the  emitter  set  to  zero 

V=IRe  +  F,  ln[(/ee+7)/(/ee-^]  (4-10) 

which  can  be  solved  for  /  when  Re  =  0: 

/  =  /eetanh(  F/Ft)  (4-11) 

The  delta  sigma  modulator  shown  in  Fig.  4-1  uses  the  three  latches  at  the  right  as  a 
quantizer;  each  latch  is  described  by  eq.  (4-9).  The  only  difference  between  them  is  that 
the  latch  in  the  middle  is  driven  by  “clock  bar”  which  is  defined  as  the  negative  of  the 
clock  voltage,  so  that  response  of  the  middle  latch  can  be  described  as  latching  during  the 
first  half  of  the  clock  period  when  the  clock  is  positive  and  tracking  during  the  second 
half  of  the  period. 


Typical  values  of  the  parameters  are  J?  =  300  ohms,  lee  =  1  niA,  Ft  =  0.02586V, 
and  C  =  30  fF.  Before  solving  for  the  three-latch  system  it  is  useful  to  look  at  the 
properties  of  eq.  (4-9)  for  positive  and  negative  clock  signals.  If  the  clock  is  negative,  the 
latch  mode  is  operational  and  eq.  (4-9)  reduces  to 


dv(r) 

- l-v(/)  =  ;8tanh[v(r)] 

At 


(4-12) 


where  P=RIee  I  Vi  =5.8  and  t  has  been  normalized  to  the  RC  time  constant.  This 
equation  has  the  solutions  shown  in  Fig.  4-8  for  values  of  the  input  voltage  Fn  at  t  =  0. 
Note  that  for  V\n«  1,  the  solutions  to  eq.  (4-12)  require  many  RC  time  constants  to  latch 
to  the  input  signal.  When  the  latch  has  a  low  input  signal  and  a  long  RC  time  constant, 
there  may  not  be  enough  time  for  the  output  to  regenerate  to  a  logical  level.  This  causes 
non-ideal  inputs  to  the  DAC,  and  potentially  leads  to  inconsistencies  between  the 
feedback  DAC  signal  and  the  digitized  output  value  that  result  in  a  degraded  SNR. 
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Input  Voltage  =  10  to  10“ 


Fig.  4-8.  Output  voltage  as  a  function  of  time  for  the  comparator  operating  in  the  latch  mode. 


For  a  latch  in  the  “following”  mode  eq.  (4-9)  reduces  to: 

dv(/) 

- -I-  v(t)  =  j8  tanh[yS  Fb(/)/jR/ec] 

d?  (4-13) 

This  equation  has  the  usual  quadrature  solution  with  two  terms;  one  corresponds  to  decay 
of  the  initial  condition  and  the  second  to  the  response  of  an  RC  circuit  driven  by  a 
possibly  saturated  gain  cell.  For  Voit)  sufficiently  small  the  hyperbolic  tangent  function 
may  be  approximated  by  its  argument,  and  the  input  voltage  drives  the  RC  circuit 
directly.  If  the  argument  of  the  tanh  is  large,  its  value  is  plus  or  minus  one  depending  on 
the  sign  of  Fo(/)  and  the  ouput  of  the  RC  circuit  reflects  little  more  than  the  sign  of  Voit). 

We  have  solved  the  equations  for  the  three-latch  system  shown  in  Fig.  4-1  with 
the  results  shown  in  Figs.  4-9  to  4-1 1.  The  rows  shown  in  Figs.  4-9  to  4-1 1  correspond  to 
the  input  Vo{t)  ,  the  two  internal  voltages  Vi(t)  and  Vjit)  and  the  output  Vi(t).  The 
columns  correspond  to  the  first  5  clock  periods.  Note  that  the  input  voltage  is  normalized 
to  its  peak  value  while  the  internal  and  output  voltages  are  normalized  to  R  /ge  =  0.3  V. 
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Fig.  4-9.  Input  voltage  Ko(/) ,  the  two  internal  voltages  F|(f)  and  V2{t)  and  the  output  voltage  V^it)  as  a 
function  of  time  in  ns  for  a  clock  period  of  0.25  ns  and  an  RC  time  constant  of  9  ps.  The  input  voltage  to 
the  quantizer  is  a  sine  wave  0.6  V  peak  to  peak  and  a  period  of  0.667  ns. 

Fig.  4-10  illustrates  the  performance  for  input  voltage  =  0.0006  V  and  the  same  9 
ps  RC  time  constant.  Note  that  while  V\(t)  and  V2{t)  are  substantially  different,  the 
quantized  output  from  the  third  latch  is  essentially  the  same.  Fig.  4-11  illustrates  the 
performance  for  input  voltage  =  0.0006  V  and  a  27  ps  RC  time  constant.  Now  the 
quantized  output  contains  2  bit  errors  in  the  first  5  bits. 


62 


1  1.05  1.1  1.15  1.2  1.25  1  1.05  1.1  1.15  1.2  1.25  1  1.05  1.1  1.15  1.2  1.25  1  1.05  1.1  1.15  1.2  1.25 


Fig.  4-10.  Input  voltage  Fo(0  ,  the  two  internal  voltages  Fi(t)  and  ^2(0  and  the  output  voltage  F3(t)  as  a 
function  of  time  in  ns  for  a  clock  period  of  0.25  ns  and  an  RC  time  constant  of  9  ps.  The  input  voltage  to 
the  quantizer  is  a  sine  wave  0.0006  V  peak  to  peak  and  a  period  of  0.667  ns. 


1  1.05  1.1  1.15  1.2  1.25  1  1.05  1.1  1.15  1.2  1.25  1  1.05  1.1  1.15  1.2  1.25  1  1.05  1.1  1.15  1.2  1.25 

Fig.  4-11.  Input  voltage  Fo(0  ,  the  two  internal  voltages  Ft(()  and  V2(0  and  the  output  voltage  ^3(0  as  a 
function  of  time  in  ns  for  a  clock  period  of  0.25  ns  and  an  RC  time  constant  of  27  ps.  The  input  voltage  to 
the  quantizer  is  a  sine  wave  0.0006  V  peak  to  peak  and  a  period  of  0.667  ns. 

The  behavioral  model  of  the  quantizer  has  been  used  in  place  of  eq.  (4-5)  to 
simulate  the  ASM  with  the  results  shown  in  Fig.  4-12.  Note  that  the  noise  level  in  the 
blue  curve  for  the  circuit  level  model  is  slightly  increased  from  the  more  ideal  calculation 
in  red  based  on  eqs.  (4-1)  to  (4-5).  The  signal  is  identical  in  each  case  and  the  location  of 
the  notch  is  the  same  to  within  simulation  error. 
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4th  Order  D-S  Modulator,  980-MHz  Input  Signal 
Simple  Quantizer/DAC-Circuit  Level  Quantizer/DAC 


Fig.  4-12  Absolute  value  of  the  discrete  Fourier  transform  of  the  output  of  the  fourth  order  delta  sigma 
modulator  times  the  Kaiser  window  as  a  function  of  frequency  in  GHz. 


4.1.3  Clock  Model 

The  square  wave  clock  model  has  been  revised  to  include  the  finite  rise  time  as 
shown  by  Fig.  4-13.  This  clock  function  is  used  in  the  complete  AEM  module  discussed 
in  Section  4.1.8.  For  the  rise  time  shown  in  Fig.  4-13,  approximately  0.02  ns,  no 
significant  effect  on  ASM  noise  shaping  is  observed  as  shown  in  Fig.  4-14  which 
compares  the  spectra  with  the  square  wave  clock  to  that  with  the  clock  waveform  shown 
in  Fig.  4-13. 


Fig.  4-13.  Clock  voltage  as  a  function  of  time  with  finite  rise  time. 
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Square  Wave  Clock/Behavioral  Clock  Model 


Fig.  4-14.  High  resolution  DSM  spectra  for  a  square-wave  clock  and  the  behavioral  model  of  the  clock 
shown  in  Fig.  4-13. 


4.1.4  Digital  to  Analog  Converter 

In  the  simple  linear  model  of  the  DSM  given  in  Section  4.1.2,  the  DAC  is 
essentially  perfect  and  the  output  of  the  “sign”  quantizer  is  fed  back  directly  into  the 
DSM.  A  two-transistor  model  of  the  DAC  is  shown  in  Fig.  4-15.  Here  the  input  voltage 
is  Fin  and  the  output  current  /out- 

^out 


Fig.  4-15:  Two-transistor  model  of  DAC. 
The  equations  that  characterize  this  device  are 


Fin  -  Fbel  +  Fbe2  “  0 

(4-18) 

Fbel  =  Fti  ln(/ei//.) 

(4-19) 

Fbe2=Ft2  ln(/e2//s) 

(4-20) 

/cl  "t  /c2  “  /tail 

(4-21) 

/out  “  (/cl  “  Itl)’ 

(4-22) 

Fbei  and  Fbe2  are  the  base-emitter  voltages  on  the  two  transistors,  Id  and  are  the 
collector  currents,  Fti  and  Ft2  are  the  thermal  voltages  at  temperature  Ti  and  Ti,  /tail  and 
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/s  are  the  tail  and  saturation  currents.  For  Vt\  =  ,  the  equations  have  the  simple 

solution 

/„„t  =  /eetanh  (Fin/2 Ft).  (4-23) 

This  equation  has  been  used  to  model  the  DACs  when  differential  heating  (unequal  Fti  = 
Ft2)  is  ignored.  It  leads  to  little  change  in  the  DSM  behavior  because  Fn »  2  Ft,  and  the 
tanh  can  generally  be  replaced  by  1  or  -1,  which  is  the  approximation  used  in  eqs.  (4-1) 
to  (4-5). 

Unfortunately,  the  input  voltages  to  the  two  transistors  are  not  equal,  but  depend 
on  the  bit  stream  out  of  the  ASM.  Thus  a  large  number  of  transitionless  bits  in  a  row  can 
lead  to  relatively  large  differential  heating  in  the  two  transistors.  If  7h  and  7l  are  the 
equilibrium  temperatures  when  the  transistors  are  on  and  off,  respectively,  Ti{t)  and  T2(t) 
are  the  temperatures  of  Q1  and  Q2  as  a  function  of  time,  and  Xth  is  the  thermal  time 
constant,  the  temperatures  evolve  in  time  according  to  the  following  equations. 

Tth  — ^  =  ^Ty(t)  +  To-  Ti(t) 

dt  (4-24) 

dTiit) 

Tth  =  -^Tyit)  +  To-  T2(t) 

dt  (4-25) 

where  To  =  (Th  +  Tt)/2  and  let  AT  -  (Th  -  2l)/2.  With  unequal  temperatures,  the  simple 

analytical  solution  given  by  eq.  (4-23)  is  lost  and  the  equations  become  a  little  more 
difficult.  It  can  be  shown  that  for  sufficiently  small  differences  in  the  temperatures  of  the 
two  transistors  eq.  (4-23)  becomes 

/out  =  /ee  tanh  I  Fin/2  F,  -  Cth  iTx-T2)/To]  (4-26) 

where  Cth  is  a  constant  that  depends  on  the  device  parameters.  In  general,  there  is  other 
temperature  dependence  in  the  device  models  of  the  two  transistors  that  lead  to  a  similar 
form  of  eq.  (4-26)  so  Cth  is  best  regarded  as  an  empirical  constant,  which  for  our  devices 
is  approximately  6.  Eqs.  (4-24)  to  (4-26)  and  the  three-latch  model  of  the  quantizer  have 
been  solved  simultaneously  with  eqs.  (4-1)  to  (4-4).  Fig.  4-16  shows  SNR  as  a  function 
of  AT  for  Tth  =  1  ns.  Fig.  4-17  shows  SNR  as  a  function  of  the  thermal  time  constant  with 
AT  =  40K. 
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SNR,  dB 


Fig.  4-16.  SNR  as  a  function  of  for  a  thermal  time  constant  of  Tth  =  1  ns. 


2  MHz,  4  MHz,  8  MHz 


0  2  4  6  8  10 


Tth,  ns 

Fig.  4-17.  SNR  as  a  function  of  T,h  for  Ar=  40K. 

4.1 .5  Integrator 


Fig.  4-18  shows  an  implementation  of  a  simple  integrator  [4-1].  It  is  designed  so 
that  the  output  voltage  is  ideally  proportional  to  the  integral  of  the  input  current,  so  that 


The  circuit  is  fully  differential,  the  output  voltage  is  given  by  Vout  =  Vg  -  V7,  and  the 
input  current  is  defined  as  fn  =  Ii  - 12-  By  the  definition  of  Vout,  the  integrator  is 
noninverting.  Rcomp  compensates  for  the  limited  value  of  Te  by  bringing  the  pole  closer 
to  zero. 

We  can  then  solve  for  the  input-output  relationships  for  this  circuit  by  using 
Kirchoff  s  Current  Laws  at  each  node.  Assuming  Ri  =  Ri,  R3  =  R4,  and  Ci  =  C2  =  C,  we 
find: 


0 

II 

1 

+ 

1 

1 

+ 

<D 

0 

(4-28) 

Node  2;  7,  +  C  ^  (V,  F,)+  =  0 

'  dr "  R, 

(4-29) 

d  ,  .  y  -y 

Nodes:  C  (V,  V,)+  ’  '=0 

comp 

(4-30) 

Node  6:  C  ^  (F,  fJ+  =  0 

dt 

comp 

(4-31) 

y -y  y _f 

Node  7:  L+  ^  =  0 

R. 

comp  I 

(4-32) 

y  _i/  y  -F 

Node  8:7,+  '  '  +  '  =0 

d^comp  ^1 

(4-33) 

From  eq.  (4-23),  we  obtain 

fy-y') 

/3-/,=4tanh  (4-34) 

V  y 

We  then  assume  perfect  emitter- followers  so  that 

(4-35) 

This  gives  us  the  necessary  equations.  However,  they  can  be  simplified  for  faster 
modeling.  Subtracting  (4-29)  from  (4-28), 

/, =  0  (4-36) 

at  at  /?2 

Subtracting  (4-31)  from  (4-30), 
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(4-37) 


dr'  dr' 


V  -V 

D 

^comp 


=  0 


Subtracting  (4-33)  from  (4-32), 


i^-i^+{f,-fA 


D 

^comp 


R, 


K-F. 


R 


=  0 


comp 


(4-38) 


and  using  the  expressions  in  (4-34)  and  (4-35)  for  h  - 1^,  we  obtain 


(f -F  ^ 
tanh  ^ 
2F  I 

\  j 


+(K-K 


1  1 

■  + - 


,Komp  ) 


^ - ^  =  0 

n 

^comp 


(4-39) 


We  now  simplify,  recognizing  we  are  only  concerned  with  differential  signals.  We  let 


/in=/l-/2 

Vout=Vs-Vv 
V,=  Vi  -  V2 
Fb=F5-F6 
which  then  yields 


(4-40) 

(4-41) 

(4-42) 

(4-43) 


,  ^dF,  ^dF„  F  ^ 
L„  +  C — ^  -  C — ^  ^  =  0 

dt  dt  R-, 


(4-44) 


dF  dF  F 

dt  dt  R 


p 

comp  ^comp 


=  0 


(4-45) 


dee  tanhl 


-V„. 


^  1  P 

- +  — 

D  D 

\^comp  J 


R 


=  0, 


comp 


(4-46) 


We  can  then  describe  the  input-output  relationship  using  eqs.  (4-44)  -  (4-46).  Some 
practical  values  for  this  circuit  are  Ci  =  C2  =  1  pF,  i?i  =  i?3  =  500  Q,  7?2  =  ^4  =  10  kQ, 
and  lee  =  20  mA.  i?comp  =  2  Vdlee  =  2.586  Q. 
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This  integrator  was  placed  into  an  otherwise  ideal  behavioral  model  in  Verilog-A 
using  the  standard  NeoCAD  coefficients.  With  an  ideal  integrator,  a  4K  FFT  gives  the 
spectrum  shown  in  Fig.  4-19. 

Transient  Response  HI 

-10,0 


-30.0 
^  -30.0 

CO 

3  -T0.0 


-90.0 
-110 

A:  (yy0.204M  -14.244) - delta: 

B:  (1.0Gf977G  -78.3458)  slope;  -3.2a2Gf1u 


Fig.  4-19.  4K  FFT  of  ideal  behavioral-level  AS  modulator  in  Verilog-A. 


The  SNR  is  about  64  dB  in  1  MHz  bandwidth.  With  the  integrator  model  in  this  memo, 
the  4K  FFT  gives  the  spectrum  shown  in  Fig.  4-20. 


A:  (y90.Z34M  -U.239d) - delta:  (TF.437Vr^4.105B)' 

B:  (1.00872G  -78.3453)  slope;  -3.46747u 


Fig.  4-20.  4K  FFT  of  ideal  behavioral-level  AS,  modulator  in  Verilog-A,  with  the  exception  of  the 
integrator  model  given  by  Eqs.  (4-44)-(4-46). 


The  SNR  is  now  approximately  64  dB  in  1  MHz  BW,  which  gives  very  similar  results  to 
that  of  an  ideal  integrator. 
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The  integrator  model  given  by  eqs.  (4-44)  and  (4-46)  has  been  added  to  the  AEM 
behavioral  simulation  including  the  other  effects  discussed  in  Sections  4.1.3  to  4.1.6. 
When  the  current  ht  is  sufficiently  large,  ideal  performance  equivalent  to  solutions  of 
eqs.  (4-1)  to  (4-5)  can  be  recovered.  As  the  current  is  reduced  the  noise  shaping  of  the 
AZM  becomes  less  effective  and  the  SNR  is  reduced  as  shown  in  Fig.  4-21.  Note  that  for 
a  noise  bandwidth  of  8  MHz,  little  is  gained  by  increasing  the  integrator  current  above 
approximately  30  mA. 

2  MHz,  4  MHz,  8  MHz 


To  illustrate  the  use  of  the  behavioral  model  of  the  integrator,  we  varied  jRcomp  in 
units  of  Ft//ee  with  the  results  shown  in  Fig.  4-20.  Note  that  for  /?comp/(Ft//ee)  less  than 
about  1.25  the  integrator  SNR  drops  rapidly  as  state  variables  become  unstable.  On  the 
other  hand  for  i?comp/(Ft//ee)  greater  than  about  3  the  SNR  values  are  not  representative  of 
the  modulator  performance  since  the  spectra  clearly  show  that  the  modulator  has  turned 
into  a  resonator.  The  operating  regime  for  the  ASM  is  about  1 .25  <  i?comp/(Ft//ee)  <3.25. 


2  MHz,  4  MHz,  8  MHz 


Fig.  4-20.  AZM  SNR  as  a  function  of  the  integrator  current  Rcomph/t/ee- 


72 


4.1.6  Full  Model:  Non-ldealized  Delta-Sigma  Modulator 

The  full  set  of  equations  for  the  AEM  developed  in  the  previous  section  are  given 


by 


'ee 


y  2  y,  J  Ucomp  Rcomp 


=  0 


(4-47) 


-g6 tanh[ 


m) 

IV, 


-Cth 


(  Txit) 
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To 


‘]l 


1  +go[«(0]-^2[j«^2(0]  +  C 


dt 


Ri 


=  0 


cI[I4i(^)-^4i(^)]  x^(t) 

■C  - - - =  0 


dt 


-^comp  -^comp 


‘ec 


.  J  1  MM 

tanh  -  -  xiit)  - +  — - 

y  2  V,  }  y  iJeomp  )  R 


Vhiit) 


^i[-«i(f)]  ~ 8i  tanh[ -  Cth  — - 1  jj 
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( n(t) 


^comp 


=  0 


(4-48) 

(4-49) 

(4-50) 


^d[Vb2(t)-v^(t)]  X2(t)  ybid) 

“O  ■  .  ““  — — — ^  y 


^dlVi,2(t)-Va(t)]  VM  ^ 

1  +  c - =  0 


dt 


Ri 


dt 


■^comp  -^comp 


«ee 


y  2  Vt )  y  i2comp  Ri )  R 


Vbzit) 


(4-51) 

(4-52) 


=  0 


.\Vi(t)  (  7l(/)  \i  r  /  r  /  VT  ^ 

-gs  tanh[  -  Cth  y  yr - 1  JJ  +  -  551^^4(0]  +  C 
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(4-58). 
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dViit) 
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d  Vlit) 
dt 


8  Vi{t) 

T - 

dt 


+  ^2(^)  -  { 
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dTi(t) 
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Tiit) 

+ - 


Tth 


+  yc(t)]  K.t[Qtb  taiih[-^] +X4w]  +  y  [1  -  ydm  KAViim}  =  o 

(4-59) 

K.t[Vi(m}  =  0 
(4-60) 

KadV^it)]}  =  0 

(4-61) 

Jo  +  AT  Vi(t) 

'^th  (4-62) 


1  [1  -  vcm  Vs^tiyim  +  ^  +  ydm 

2  2 

1  [1  +  vcim  Vs^viim  +  i  [1  -  vcim 

2  2 


where  Vc(t)  is  the  clock  function  given  by  an  interpolation  function  for  the  data  plotted  in 
Fig.  4-13  and 


^at(^)  “■  -^Latch  /eeLatch  t^nh 


Use  of  the  parameters  i?Latch  and  /ecLatch  for  the  resistance  and  current  of  the  latches  used 
in  the  quantizer  allows  the  values  of  these  parameters  to  be  different  from  those  of  the 
same  parameters  of  the  integrator. 

The  first  12  equations  in  the  set  (4-47)  to  (4-62)  are  obtained  by  repetitive 
application  of  eqs.  (4-44)  to  (4-46)  for  the  four  integrators  shown  in  Fig.  4-1  that  were 
represented  by  the  simple  time  derivatives  in  eqs.  (4-1)  to  (4-4).  The  DACs  are 
represented  by  the  tanh{K3(r)/2Ft-Cth[Ti(/)/7o-l]}  terms  in  eqs.  (4-48),  (4-51),  (4-54), 
and  (4-55).  Note  that  it  is  not  necessary  to  solve  for  both  T\(t)  and  Txit)  because  they  are 
antisymmetric  for  identical  transistors  in  the  DAC.  Saturation  of  the  gain  cells  is 
included  by  the  replacement  g-,  A:i(t)with  ^i[xi(t)]  where  ^i[. . .]  indicates  the  inverse 
function  obtained  by  solving  eq.  (4-7)  [this  must  be  done  numerically  with  an 
interpolation  function  since  there  is  no  analytical  solution  for  eq.  (4-7)]  .  The  quantizer  is 
represented  by  eqs.  (4-59)  to  (4-61)  that  consist  of  3  versions  of  eq.  (4-9)  applied  to  the 
three  latches  in  series  in  Fig.  4-1.  Note  the  middle  latch  is  clocked  by  “clock  bar”,  which 
is  why  the  signs  in  the  [1-  Vc(t)]  and  [1+  Veit)]  terms  are  reversed  in  eq.  (4-60). 
Equstions  (4-47)  to  (4-62)  are  a  set  of  coupled  algebraic  and  differential  equations. 
Because  they  contain  time  constants  that  vary  by  orders  of  magnitude,  they  are  also  stiff 
differential  equations.  Thus  application  of  a  simple  differential  equation  solver  such  as 
first  order  Runge-Kutta  is  not  likely  to  be  successful.  This  is  one  of  the  underlying 
rationales  for  the  development  of  HAARSPICE. 

4.1.7  Conclusions 

tVi 

We  have  developed  a  detailed  behavioral  level  model  for  the  4  order,  continuous 
time  delta  sigma  modulator  that  is  based  on  simple  transistor-level  approximations  to  the 
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key  components  of  the  modulator:  Integrator,  Gain  Cell,  Clock,  DAC  and  Quantizer. 
Numerical  solutions  for  the  equations  that  describe  this  model  have  been  performed  to 
obtain  frequency  spectra  for  the  ASM.  These  spectra  have  been  carefully  benchmarked 
against  circuit-level  simulations  and  Simulink  calculations  that  have  been  used  in  the 
design  process  at  HRL  for  many  years.  Use  of  these  behavioral  models  has  enabled  us  to 
predict  ASM  performance  as  a  function  of  circuit  level  parameters  such  as  the  integrator 
current  and  compensation  resistance,  the  gain  cell  current  and  resistance,  the  latch  RC 
time  constants,  and  the  thermal  response  time  for  transistor  heating,  and  these 
calculations  typically  take  less  than  10  minutes  on  a  1-GHz  PC. 
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Hardware  Description  Langauge  (HDL)  Models 


Hardware  description  language  models  (HDLs)  exist  to  describe  hardware.  In  this  they 
differ  from  traditional  programming  languages,  which  generally  exist  to  describe 
algorithms.  Hardware  description  languages  have  two  primary  applications:  simulations 
and  synthesis.  With  simulation,  one  applies  various  stimuli  to  an  executable  model  that  is 
described  using  the  HDL  in  order  to  predict  how  it  will  respond.  Simulation  allows  you  to 
understand  how  complex  systems  behave  before  you  incur  the  time  and  expense  of 
implementing  them.  Synthesis  is  the  process  of  actually  implementing  the  hardware.  In 
this  chapter,  Verilog-A  models  used  in  sigma-delta  modulators  will  be  described.  In 
Section  4.2.1  we  describe  the  ideal  model  of  a  2"^*  order  continuous  time  delta-sigma 
modulator  using  linear  electronic  components  as  building  blocks.  In  section  4.2.2  we 
introduce  non-idealities  in  the  delta-sigma  modulator  and  show  gradual  changes  in  the 
performance  of  the  modulator.  Finally  we  summarize  the  results  in  section  4.2.3 

4.2.1  Ideal  Models 

Most  mixed  signal  systems  and  sub-systems  can  be  built  using  reusable  subblocks  that 
can  be  configured  in  multiple  ways  to  meet  design  specifications.  We  begin  with  the  ideal 
model  of  a  2"*^  order  delta-sigma  modulator  as  shown  in  Fig.  1 


An  ideal  2"'*  order  continuous  time  delta-sigma  modulator  can  be  modeled  as  a  linear 
circuit  in  which  the  quantizer  is  replaced  by  the  “sign”  function.  However,  the  Signal  to 
Quantization  Noise  (SQNR)  is  a  random  function  of  the  input  amplitude  and  frequency, 
and  therefore,  the  modulator  dynamics  are  not  as  simple  as  the  linear  model  predicts.  We 
begin  with  using  the  “sign  ”  function  for  the  bipolar  quantizer,  which  gives  an  output  of  ±1 . 
The  other  sub-blocks,  the  Gm-cell  and  integrator  used  in  the  loop  filter  and  the  current 
steering  digital-to-analog  converter  (DAC)  are  all  modeled  using  ideal  components.  To 
simulate  the  idealized  delta-sigma  modulator,  we  have  used  HAARSPICE  and  Cadence’s 
Spectre  circuit  simulators.  Currently  HAARSPICE  accepts  Cadence’s  spectre  netlistbut 
cannot  parse  verilog-A  based  models.  Fig.  2  shows  the  idealized  modulator  with  linear 
components.  Table  1,  given  below,  lists  the  one-to-one  mapping  between  the  blocks  and 
the  components. 
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Delta  Sigma  Blocks 

Equivalent  Circuit  Components 

Ideal  Gm-Cell 

Voltage  controlled  current  source 

Ideal  Integrator: 

Opamp:  voltage  controlled  voltage  source 

Ideal  capacitor 

Digital  to  analog  converter 

Voltage  controlled  current  source 

Ideal  Quantizer 

Using  Sign  function  from  Verilog-A 

Ideal  Delay 

Using  Delay  function  from  Verilog-A 

Input  Source 

Independent  voltage  soince 

Clock 

Square  wave  pulse  generator 

Table  1.  Equivalent  Circuit  Components  used  to  represent  Subblocks 
Fig.3  shows  the  output  spectrum  from  the  simulation.  The  circuit  was  simulated  for  9000 
clock  cycles  and  the  frequency  has  been  normalized  to  a  sampling  rate  of  IHz  and  input 
signal  with  .025  Hz,  resulting  in  an  over-sampling  ratio  (OSR)  of  20. 


Fig.  3:  The  output  spectmm  052"“*  order  delta-sigma  modulator  using  a  Blackman  window 
We  now  list  the  verilog-A  models  of  the  quantizer  and  delay  cell  used  in  our  modulator. 
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//  VerilogA  for  NeocadLib,  quantizer2,  veriloga 

//  VerilogA  for  NeocadLib,  delay,  veriloga 

'include  "constants.h" 

'include  "discipline.!!" 

'include  "constants.h" 

'include  "discipline.h" 

module  quantizer(inp,iim,outp, onto, elk); 
input  inp,inn,clk; 
output  outp.outn; 
electrical  inn,inp,outp,outn,clk; 

module  delay_op(out,in); 
inout  out, in; 
electrical  out,in; 
parameter  real  td  =  le-3; 

parameter  real  quantizer_vth=0,0; 
parameter  real  clk_vtli=2.5; 
parameter  real  vlevel=0.5; 
parameter  real  trise=lm; 
parameter  real  tfall=lm; 

real  vout; 
real  vint; 

analog  begin 
@(initial_step)  begin 
current_time  =  0.0; 
end 

analog 

V(out)  <+  delay('V(in),td); 

endmodule 

@(cross('V(clk)  -  clk_vth,l)) 
begin 

vint  =  'V(inp,inn); 
current_time  =  Sabstime; 

//  2  input  quantizer 

if  (vint  >  quantizer_vth)  vout  =  vlevel; 
else 

vout  =  -vlevel; 

end 

V(outp)  <+  transition(vout,0,trise,tfall); 

V(outn)  <+  transition(-vout,0,trise,tfall); 
end 

endmodule 

The  above  models  are  simulated  inside  Cadence’s  spectre  simulator.  We  introduce  the 
non-idealities  in  the  next  section. 

4.2.2  Introducing  Non-Idealities 

Delta-Sigma  modulators  have  complex  dynamics  and  therefore,  most  designers  try  to 
isolate  individual  parameters  that  affect  their  performance.  The  most  significant 
performance  metric  used  is  the  signal  to  noise  ratio  (SNR)  at  the  output  of  the  modulator. 
High  performance  modulators  have  SNR  over  120  DB.  In  our  work  we  focused  on  the 
more  pressing  problem  of  DAC  noise  in  delta-sigma  modulators.  DAC  noise,  unlike  other 
noise  sources  such  as  loop  filter  noise  or  quantizer  noise,  is  not  decreased  by  the  loop 
filter  gain.  Therefore,  reducing  DAC  noise  is  a  challenging  design  problem.  To  address 
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this  problem,  we  have  introduced  two  new  non-idealities  in  the  model  described  in  the 
previous  section.  The  DAC,  which  was  modeled  using  voltage  controlled  current  source, 
is  changed  to  a  current  steering  differential  pair  of  transistors  with  an  ideal  current 
source.  The  effect  of  thermally  induced  inter-symbol  interference  is  modeled  with  a 
verilog-A  subblock  wrapped  around  the  Bipolar  junction  transistor  (BJT)  models.  The 
BJTs  are  described  with  a  Gummel-Poon  SPICE  model.  Fig.  4  shows  the  2"*^  order  delta 
sigma  modulator  with  non-ideal  DAC  and  a  thermally  induced  inter-symbol  interference 
model  wrapped  around  transistors. 


We  now  describe  thermally  induced  inter-symbol  interference  and  its  effect  on  the  overall 
performance  of  the  modulator.  Thermal-Induced  Inter-Symbol  Interference  is  caused  by 
the  following  phenomenon:  two  differential  signals  enter  a  transistor  pair  causing  the 
output  to  switch  back  and  forth  as  shown  in  Fig.  6.  If  the  signal  into  transistor  Qi 
switches  to  high  and  the  signal  into  Q2  switches  to  low,  then  Qi  conducts  current  and  Q2 
does  not.  This  causes  transistor  Qi  to  heat  up,  and  transistor  Q2  to  cool  down.  The  next 
time  the  signal  switches,  Qi  is  hotter  than  Q2.  When  a  transistor  heats  up,  it  needs  less  of 
an  input  voltage  to  turn  on  than  if  it  was  cool.  This  means  that  the  input  signal  to  Qi  must 
go  lower  to  turn  Qi  OFF,  and,  by  contrast,  the  signal  to  Q2  must  go  higher  to  turn  Q2  ON 
based  on  the  input  signals.  Because  of  this,  the  new  switching  time  is  t  later  than  it 
would  be  if  the  transistors  had  equal  temperature.  At  is  assumed  to  be  inversely 
proportional  to  the  slope  of  the  input  signal  (i.e.  1/m)  as  shown  in  Fig.  5.  If  the  slope  is 
perfectly  vertical,  then  ^  =  0,  regardless  of  the  temperature  difference.  If  the  slope  is 
gradual  then  t  would  become  more  sensitive  to  the  transistor  temperature  differences. 


Fig.  5:  Differential  DAC  Input 
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Fig. 6:  Current  Controlled  DAC 


If  Th  is  the  equilibrium  temperature  when  the  transistor  in  a  differential  pair  is  ON  and  Tl 
is  the  equilibrium  temperature  when  the  transistor  is  OFF,  then  T(n),  the  temperature  of  a 
transistor  Q  at  the  n‘^  clock  pulse  for  the  case  when  it  does  not  change  state  between 
consecutive  clock  cycles  (i.e.  s(n  - 1)  =  s(n) )  is  given  by 

T(n)  =  Tin  - 1)  +  [T„  -  Tin  - 1)](1  -  -  (1) 

for  an  ON  transistor  and 

Tin)  =  Tin  - 1)  +  [T,  -  Tin  - 1)](1  -  )  -  (2) 

for  an  OFF  transistor.  P  is  the  clock  period  and  t  is  the  thermal  time  constant.  For  the 
case  when  the  transistor  makes  a  transition  between  consecutive  clock  cycles 
(i.e.  - 1)  =  -sin)  ), 

Tin)  =  Tin  - 1)  +  [T„  -  Tin  - 1)](1  -  •  •  •  (3) 

for  a  transistor  which  has  just  turned  OFF  and 
Tin)  =  Tin  - 1)  +  [T,  -  Tin  - 1)](1  -  )  -  (4) 

for  a  transistor  which  has  turned  ON.  Using  To  =  iTn  +  Ti)l2,  T  =  iTn  -  Ti)/2,  we  can 
write  simplified  equations  for  the  differential  pair  from  Fig.  1 : 

For  the  ON  transistor  Qj 
T,in)  ^e’''%in-\)+T,i\-e^'^)+Mi\-e^'')  •  •  -(5) 
and,  for  OFF  transistor  Q2 
T^in)=e-^'%in-\)+T^i\-e’’'^)-l^i\-e‘’'^)  ■■■i6) 


Figure  7:  Base  Emitter  Voltage  (F^)  vs  Temperature 

Using  the  base-emitter  voltage  iVoe)  versus  temperature  curve  from  Fig.  7  we  determine 
the  following  relation: 
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=  (7) 

dT  r(«)-r_ 

c  is  the  rate  of  change  of  base-emitter  voltage  with  device  temperature  (mV/°C) 


/ee/2  and  Vshift(ft)  Vi}ei(n)-Vi}e2(^) 


At  the  point  were  the  DAC  switches,  7/  =  I2 

-  (8) 

T](n)  and  T2(n)  are  the  device  temperature  of  transistors  Q1  and  Q2. 
Finally  from  Eq.  6  and  fig.  2  we  find 
F....  (n) 


At(n)  = 


shift 

2m 


(9) 


To  model  thermal-induced  inter-symbol  interference,  one  needs  to  change  the  physical 
attributes  of  the  transistor  (e.g.,  saturation  current)  at  every  clock  cycle.  Dynamically 
changing  a  physical  model  in  a  SPlCE-based  simulator  is  currently  not  possible;  instead, 
we  model  this  effect  with  a  perturbation  voltage  at  the  base  of  the  transistor.  We  assume  a 
simplified  transistor  model  with 


(10) 


where  Vbeo  is  the  base-emitter  voltage  and  Vjo  is  the  thermal  voltage  at  Tnom-  Therefore 
from  Eq.  7 


n,r(o=c(m-T„,j+F,,,-(ii) 


The  perturbation  voltage  is  added  to  the  DAC  as  shown  in  Fig.  4  is  AF  = 


Figure  8:  Current  Controlled  DAC  with  voltage  perturbations 

Eq.  11  is  modeled  using  Verilog-A  and  was  simulated  in  Cadence  Spectre  simulator.  The 
Verilog-A  code  is  as  follows: 

//  VerilogA  for  NEOTFASTLib,  VJT,  veriloga 
'include  "constants. h" 

'include  "discipline.h" 

module  VJT(vminus,vplus); 
inout  vminus,vplus; 
electrical  vminus,vplus; 
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//  temperature  dependent  parameters 

parameter  real  tc  =  2e-9;  //  thermal  time  constant 

parameter  real  th  =  310;  //  equilibrium  temperature  (High) 

parameter  real  tl  =  300;  //  equilibrium  temperature  (Low) 

parameter  real  pw  =  125e-12;  //  pulse  width; 

parameter  real  c  =  1.1  le-3;  //  thermal  coefficient  [mV/degC] 

real  delt,to,tnom; 

real  real_time,prev_time, step, state; 
real  prev_devt,curr_devt,dvj; 

analog  begin 

@  ( initial_step  or  initial_step("static") )  begin 
to  =  (th  +  tl)/2; 
tnom  =  to; 
delt  =  (th  -  tl)/2; 
prev_devt  =  to; 
curr_devt  =  prev_devt; 
dvj  =  0.0; 
end 

//  Modeling  the  input  dependent  Device  Temperature 
//  rising  edge  of  the  input 
@  (cross(V(vminus),1.0))  begin 
state  =  1 ; 

step  =  Sabstime  -  prev_time; 
curr_devt  =  exp(-step/tc)*prev_devt 
+  to*(l  -  exp(-step/tc)) 

-  delt  *  (1  -  exp(-step/tc)); 
dvj  =  c*(curr_devt-tnom); 
end 

//  falling  edge  of  the  input 
@  (cross(V(vminus),-1.0))  begin 
state  =  -1; 

step  =  Sabstime  -  prev_time; 
curr_devt  =  exp(-step/tc)*prev_devt 
+  to*(l  -  exp(-step/tc)) 

+  delt  *  ( 1  -  exp(-step/tc)); 
dvj  =  c*(curr_devt-tnom); 
end 

//  At  every  clock  pulse 
@  (timer(0,pw))  begin 
step  =  0.0; 

prev_time  =  Sabstime; 
prev_devt  =  curr_devt; 
if  (state  ==  1)  begin 
curr_devt  =  exp(-pw/tc)*prev_devt 
+  to*(l  -  exp(-pw/tc)) 

+  delt  *  (1  -  exp(-pw/tc)); 

,  end 
else  begin 

currdevt  =  exp(-pw/tc)*prev_devt 
+  to*(l  -  exp(-pw/tc)) 

-  delt  *  (1  -  exp(-pw/tc)); 
end 

//  Changing  the  emitter  base  junction  potential 
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dvj  =  c*(curr_devt-tnom); 
end 

V(vminus,vplus)  <+  dvj; 
end 

endmodule 

An  SHBT  device  has  a  thermal  resistance  of  12.3C/mW.  However,  this  number  is 
complicated  by  the  fact  that  transistors  in  a  DAC  are  differential.  This  means  that  when 
one  transistor  in  the  DAC  heats  up,  a  transistor  in  close  proximity  cools  down.  The 
substrate  is  much  thicker  (about  100  microns)  than  the  distance  between  the  two 
transistors  (about  5  microns)  and  this  means  that  the  thermal  resistance  for  our  purposes 
is  actually  much  less  than  the  above  number.  This  observation  leads  us  to  use  a  thermal 
time  constant  that  is  much  less  than  the  published  value.  In  our  experiments  the  thermal 
time  constant  (t)  is  set  at  2  nsec.  Other  parameters  are  as  follows: 

Table  2:  Thermal  Induced  ISI  Model  parameters 


Quantizer  output  Rise/Fall 
Time 

50ps/50ps 

dVbe,on/dT  (c) 

1.11 

mV/°C 

Pulse  Width  (P) 

0.5  sec 

Nominal  Temperature  (To) 

310°K 

Min-Max  equilibrium  Temp 

To  measure  the  SNR  degradation  we  sweep  variables:  T  (i.e.  {Jh  -  Tj^H)  and  rate  of 
change  of  base-emitter  voltage  with  temperature  (i.e.  c).  The  simulator  Temperature  is 
made  equal  to  the  nominal  temperature  (i.e.  To).  The  plot  in  Fig.9  shows  the  input 
voltage  dependent  device  temperature  with  T=10°K,  Tnom=310°K  and  T=2nsec.  The 
plot  of  the  frequency  spectrum  from  the  output  of  the  2"^*  order  CT  E  modulator  is  shown 
in  Fig.  10  for  different  values  of  T.  The  CT  E  modulator  has  been  simulated  using 
Cadence  Spectre  simulator  for  8,000  clock  cycles  with  a  single-tone  signal  of  amplitude 
200mVat  5MHz.  The  higher  OSR  (i.e.  fs/2fo=100)  keeps  the  quantization  noise  power  low 
and  accentuates  the  effect  of  noise  generated  by  the  thermal  mismatch. 


Fig.  9:  Signal  dependent  device  temperature  change  with  T=10°lC,  Tnom=3 10°K,  7=2nsec 
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Fig.  10:  Frequency  Spectrum  at  T  at  0°K,  5°K  showing  30db  degradation  in  SNR  at  OSR=100 

Fig.  10  shows  the  effect  of  temperature  and  device  mismatches.  The  devices  have  been 
mismatched  by  10%  by  changing  /3  (beta).  Is  (saturation  current)  and  c  (Rate  of  tum-on 
voltage). 

The  device  mismatch  parameters  are  as  follows: 


Table  3;  Device  Mismatch  Parameters 


Device  Parameter 

01 

02 

Saturation  Current 
(Is) 

5e-15A 

6e-15A 

Beta  (/3) 

50 

55 

dVbe/dT 

1.1  ImV/ 
°C 

1.25mV/° 

C 

Fig.  11:  Frequency  spectram  with  1 0°K  Temperature  mismatch  and  1 0%  Device  mismatch 
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4.2.3  Conclusion 


The  results  from  the  models  show  the  dependence  of  SNR  in  a  CT  E  modulator  on  the 
thermal  characteristics  of  the  transistors.  From  Fig.  9  one  can  make  the  following 
observations.  The  SNR  degradation  in  a  continuous  mode  delta-sigma  modulator  is 
strongly  correlated  to  the  signal  dependent  temperature  mismatch  in  the  DAC.  With  5°K 
temperature  mismatch  and  slope  at  quantizer  output  of  20V/ns  (i.e.  rise/fall  time  =50ps  and 
Amplitude  =  +/-  500m V),  the  maximum  A  t  shift  is  140  femto-seconds.  This  causes  a 
SNR  degradation  of  approx  30dB  as  seen  from  Fig.  10.  With  10°K  mismatch  the  SNR 
degrades  further  by  20dB  as  shown  in  Fig.  1 1 .  Any  further  increase  in  temperature 
mismatch  can  cause  the  modulator  to  go  unstable. 

The  top-down  methodology  introduced  in  this  chapter  using  verilog-A  is  a  substantial 
refinement  over  the  older  top-down  design  methodology  used  at  HRL  which  involved 
Matlab  Simulink  and  SPICE  macro  models.  Using  verilog-A  the  transition  between 
algorithm  and  architectural  design  was  more  continuous  in  the  design  flow  and 
considerable  amount  of  guess  work  involved  in  the  transition  to  the  transistor  level  design 
was  reduced. 
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Chapter  5  Parasitic  Extraction  Tool 


5.1  Introduction 

The  purpose  of  the  proposed  parasitic  extraction  tool  was  to  estimate  in  a  rigorous 
manner  the  network  parameterization  of  the  passive  interconnect  structures  of  a  mixed 
signal  system  using  a  fast  frequency-domain  fast  full-wave  electromagnetic  simulator. 
The  underlying  technique  is  a  high-order  method  of  moment  solution  with  a  novel  fast 
iterative  solution  method,  referred  to  as  the  Quadrature-Sampled  Pre-Corrected  FFT  (QS- 
PCFFT)  method.  In  this  section,  the  fundamental  integral  equation  formulation  and 
parameter  extraction  method  is  outlined.  The  QS-PCFFT  algorithm  is  also  detailed. 
Validating  results  and  benchmarking  of  the  tool  are  also  provided. 

5.2  Method  of  Moment  Formulation 

The  intercormect  networks  of  the  mixed-signal  circuit  is  assumed  to  consist  of  a  network 
of  metallic  signal  and  reference  lines  and  ground  planes  etched  within  a  layered  dielectric 
media.  Metal  losses  are  modeled  via  an  impedance  boundary  condition  (non-penetrable 
conductors),  or  resistive  boundary  condition  (penetrable  conductors).  The  integral 
equation  formulation  used  for  this  physical  problem  is  the  mixed  potential  integral 
equation  (MPIE): 

t  ■  (F)  =  Z/  ■  J (F) - r  •  jp,  (F, F') •  J (r)ds' - 1  •  jjVG,  (F, r')V -J (F') ds'  (0.1) 

5  S 

where  is  the  dyadic  vector  potential  and  Gy  the  scalar  potential  for  a  layered  media, 
F  is  a  vector  tangential  to  the  conductor  surface  S,  and  is  the  surface  impedance  of  the 

metallic  surface.  The  dyadic  vector  potential  and  the  scalar  potential  are  specialized  to 
planar  layered  media  assumed  to  be  infinite  in  extent  [1,2]  and  can  be  expressed  in  the 
form  of  Sommerfeld  integrals.  In  the  code  developed,  the  Sommerfeld  integrals  were 
pre-computed  using  a  deformed  path  integration  [3],  tabulated,  and  interpolated  to 
controllable  accuracy. 

The  unknown  surface  current  density,  J{r'),  is  calculated  using  a  method  of  moment 

solution.  The  surface  is  discretized  via  either  curvilinear  quadrilaterals  or  triangles.  The 
unknown  surface  current  density  is  approximated  via  high-order  divergence  conforming 
basis  functions  proposed  by  Graglia,  Wilton  and  Peterson  [4].  A  Galerkin  testing 
procedure  is  employed.  Consequently  the  basis  and  testing  functions  are  identical.  This 
leads  to  a  discrete  linear  system  of  equations: 

Zj  =  v  (0.2) 

where, 

Z„,,„  =  |JF„,  {r)-]„{r)ds-  JJr„  (F)  •  (F,  F')  •  J„  (F')  ds' 

S,„  S,„  s„ 

-  jj^Gy{r,r)Y.l{r')ds',  (0.3) 

S„ 

=  l\Li^)-E'"{r)ds 
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j  represents  the  unknown  constant  coefficient  vector  weighting  the  basis  functions,  is 
the  surface  supporting  the  vector  test  function  4  and  is  the  surface  supporting  the 
vector  basis  function  .  Finally,  for  low  frequency  stabilization,  loop-star  basis 
functions  are  used  [5].  The  loop  star  basis 


5.3  Network  Parameter  Extraction 


The  method  of  moment  solution  presented  in  the  previous  section  is  used  to  extract  the 
admittance  parameters,  or  Y-parameters,  of  the  mixed  signal  system  network.  In  order  to 
extract  the  Y-parameters,  one  needs  to  drive  a  port  with  an  ideal  voltage  source,  while 
short  circuiting  all  the  other  loads  with  an  ideal  short  circuit.  Then,  by  calculating  the 
short  circuit  currents  at  all  ports,  a  column  of  the  Y-parameters  is  calculated.  One  of  the 
challenges  of  using  a  method  of  moment  solution  is  that  we  cannot  emulate  an  ideal  short 
circuit,  since  a  finite  shorting  pin  or  strip  has  parasitic  inductance  and  possible  radiation 
loss.  Also,  one  cannot  exactly  reproduce  an  ideal  voltage  source.  As  a  consequence,  we 
need  to  identify  appropriate  standards  an  appropriate  de-embedding  techniques  so  that 
non-physical  parasitic  affects  can  be  de-embedded  from  the  estimation  of  the  network 
parameters  of  the  device  under  test.  A  novel  procedure  we  have  developed  to  do  this  is 
briefly  described  in  this  section  for  microstrip  lines.  Similar  procedures  can  be  applied  to 
other  types  of  waveguides,  such  as  co-planar  waveguides,  slot-lines  or  striplines. 

Consider  an  A-port  network  fed  by  N  microstrip  signal  lines.  A  port  extension  of 
length  f  is  then  added  to  each  line  beyond  the  desired  reference  plane  of  the  port.  Each 
port  extension  is  terminated  with  an  open  circuit.  A  delta-gap  source  is  placed  a  distance 
d  from  the  reference  plane.  Fig.  5.1(a)  illustrates  the  transmission  line  model  for  the  port 
extension  and  an  equivalent  circuit  model.  In  the  equivalent  circuit  model,  represents 
the  equivalent  input  impedance  of  the  open  circuited  stub  line  beyond  the  delta-gap 
source,  and  [A^]  is  the  ABCD,  or  transmission,  matrix  of  the  uniform  line  of  length  d. 


Both  Zj  and  [A^]  are  extracted  using  a  method  of  moment  simulation  of  the  actual  port 


extensions,  where: 


A  B 
C  D 


(0.4) 


Since  the  port  extensions  are  small,  this  operation  requires  nearly  negligible  computer 
resources. 


O.C. 


Fig.  5.1  Port  extension  for  the  i 


Reference 

d  Plane 

- w 


>• 


(b) 


port:  (a)  equivalent  transmission  line,  (b)  equivalent  circuit  model. 
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Once  the  port  extension  parameters  are  computed,  the  7-matrix  of  the  N-port  network 
is  extracted.  As  an  example,  we  consider  a  four-port  network.  In  this  simulation,  each  of 
the  four  ports  will  be  excited  separately.  Consider  the  excitation  of  port  1,  by  a  delta-gap 
voltage  source.  For  the  remaining  ports,  the  delta-gap  source  is  short  circuited.  The 
method  of  moment  simulation  is  then  used  to  compute  the  currents  in  the  reference  plane. 
The  voltage  at  each  port  cannot  be  measured  directly  from  the  method  of  moment 
simulation.  Rather,  the  port  voltages  are  then  estimated  using  the  equivalent  circuit 
models  in  Fig.  5.1(b).  To  this  end,  the  voltage  on  the  A:-th  port  is  estimated  as: 

V,=-I\zr  (0.5) 

where,  l\  represents  the  current  flowing  into  the  A:-th  port  due  to  the  delta-gap  source 


voltage  on  port  1,  and  Zf‘  represents  the  effective  input  impedance  of  the  A:-th  port 

extension  as  extracted  from  the  equivalent  circuit  parameters  in  Fig.  1  (b).  Next,  the  port 
voltage  on  port  1  in  the  reference  plane  is  computed  as: 

y.=F's=(y,-2'A-B,I\)lA,  (0.6) 

where  Z]  is  the  input  impedance  of  the  open  circuit  stub  for  port  1 ,  l\  is  the  current  at 

the  delta-gap  source  as  computed  by  the  MoM  simulation,  /,'  is  the  current  in  the 


reference  plane  of  port  1  as  computed  by  the  MoM  simulation,  and  and  5,  are  ABCD 
parameters  of  the  feed  line  in  Fig.  1.  Given  these  parameters,  the  7-matrix  is  then 
expressed  as: 


(0.7) 


Next,  a  similar  calculation  is  performed  for  the  remaining  three  ports.  Consequently, 
from  this  simulation,  the  A:-th  column  of  the  7-matrix  can  be  computed  from  the  linear 
system  of  equations: 
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where  II  represents  the  current  in  the  reference  plane  of  the  y-th  port  due  to  the 


excitation  of  port  k  with  a  delta-gap  source,  and  from  (0.6)  Fg  =  -  BJl )  /  . 

Finally,  once  the  7-parameters  are  computed  the  scattering  parameters  can  be  computed 
for  the  A-port  network  directly  given  the  reference  impedance  for  each  port. 


5.3  Validation  of  Parasitic  Extraction  Tool 

To  validate  the  parasitic  extraction  tool,  the  scattering  parameters  are  extracted  from 
the  microstrip  low-pass  filter  in  Fig.  5.2.  The  scattering  parameters  predicted  by  the 
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present  method  are  eompared  against  those  predieted  using  Zeland  IE3D™  in  Fig.  5.3. 
Both  the  magnitude  and  Phase  eompare  quite  well  for  the  two  schemes. 

A  second  example  is  a  microstrip  meander  line,  which  is  a  test  structure  developed  by 
HRL  Laboratories.  The  structure  is  illustrated  in  Fig.  5.4  (a)  and  (b).  The  meander  line 
circuit  has  a  width  of  400  ptm  and  length  of  about  753  /rm.  The  total  length  of  the 
meander  line  is  5055  jum.  The  microstrip  line  is  3  jUm  wide,  and  is  printed  under  a  700 
fim  thick  InP  layer  {£^=12,  tan ^  =  .0005)  and  on  a  1.95  /rm  thick  Polyimid  layer 

(e^  =  4 ,  tan^  =  .001 )  backed  by  a  conductor.  The  scattering  parameters  were  extracted 

over  the  frequency  range  of  0.1  -  75  GHz  using  the  University  of  Kentucky  (UKY)  code 
as  well  as  Zeland  IE3D™.  The  results  are  compared  in  Fig.  5.4  (c).  There  is  some 
deviation  in  the  results.  It  has  been  concluded  that  the  deviation  is  due  to  the  surface 
impedance  approximation.  In  the  UKY  code,  the  broad  band  surface  impedance  is  based 
on  the  formulation  presented  in  [6].  The  technique  employed  by  IE3D  is  unknown. 

A  third  example  is  that  of  a  6-pole  elliptic  microstrip  filter  with  folded  hair-pin 
resonators.  The  filter  is  printed  on  a  32  mil  substrate  with  £^  =  3.38 .  Again,  the 

scattering  parameters  were  extracted  using  the  present  method  and  are  compared  to  those 
computed  via  Zeland  IE3D™.  These  results  are  illustrated  in  Fig.  5.5.  Again,  quite  good 
agreement  is  realized. 


89 


Fig.  5.2  :Top  view  of  a  50  Q  microstrip  low-pass  filter  printed  on  a  10  mil  Alumina  substrate  =  9.8 ). 


0  5  10  15  20  25  30  35  -10 

Frequency  (GHz) 


0  5  10  15  20  25  30  35  'lO 


Frequency  (GHa) 

Fig.  5.3  Scattering  parameters  of  the  double  stub  filter  compared  against  IE3D. 
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InP:  £r^=  12,  tanS  =  0.0005  ' 

Polyimid:  a'  =  4,  tanS  =  0.001 
Conductors:  a  =  4.1e+7  S/m 
Strip  width  =  3  pm 


(a) 


- - OKY  dB(S(U))  — • —  1E3D 

- m —  UKY  dBCS(2.1I)  - * - IB3D«»B}S[2,l)I 


Fig.  5.4:  HRL  test  structure  -  the  “meander  microstrip  line.”  (a)  cross  section  of  substrate  and  microstrip 
line,  (b)  top  view  of  the  meander  line,  (c)  scattering  parameters  computed  via  the  UKY  simulation 

code  and  Zeland  IE3D™. 
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Fig.  5.5:  6  pole  elliptic  microstrip  bandpass  filter  printed  on  a  32  mil  substrate  (e^  =  3.38 )  and  the 
scattering  parameters  computed  by  the  UKY  software  tool  (Sdhogal)  as  compared  to  Zeland  IE3D. 


5.4  Quadrature  Sampled  Pre-Corrected  FFT  Method 

The  computational  resources  required  to  simulate  a  large  structures  using  a  method  of 
moment  procedure  can  be  significantly  reduced  via  a  fast  solution  procedure.  When 
employing  a  high-order  scheme,  it  also  becomes  imperative  to  employ  a  solution 
procedure  that  is  fast,  and  maintains  the  high-order  nature  of  the  algorithm.  In  this 
research  program,  we  have  developed  a  new  fast  iterative  solution  method,  referred  to  as 
the  Quadrature  Sampled  Pre-Corrected  FFT  (QSPCFFT)  [7,  8].  The  QSPCFFT  method 
is  in  the  same  class  of  methods  as  the  Adaptive  Integral  Method  (AIM)  [9-11]  or  the  Pre- 
Corrected  FFT  (PCFFT)  method  [12].  The  specific  advantage  of  the  QSPCFFT  method 
is  that  it  does  not  require  the  computation  of  moments  to  project  the  currents  to  the 
uniform  grid.  Rather,  the  discrete  Fourier  transform  (DFT)  of  the  current  is  evaluated 
directly  using  the  discontinuous  FFT  [13]  [7],  which  is  based  on  a  quadrature  sampling  of 
the  currents.  The  advantage  of  this  scheme  is  that  the  DFT  of  the  current  is  computed  to 
controllable  accuracy  and  is  exponentially  convergent. 

The  QSPCFFT  is  used  to  accelerate  an  iterative  solution  of  the  method  of  moment 
matrix.  Each  entry  of  the  impedance  matrix  is  expressed  as  the  irmer-product  of  the  test 
vector  with  the  convolution  of  the  MPIE  kernel  with  the  basis  function.  The  QSPCFFT  is 
used  to  accelerate  these  reactionary  terms  in  the  regions  where  the  field  cell  supporting 
the  test  vector  and  the  source  cell  supporting  the  basis  function  vector  are  sufficiently  far 
removed  such  that  the  kernel  is  sufficiently  smooth.  In  this  “far-field”  region,  the 
convolution  is  then  computed  very  efficiently  by  the  inverse  Fourier  transform  of  the 
product  of  the  Fourier  transform  of  the  arbitrary  current  density  (computed  using  the 
discontinuous  FFT  [13])  with  the  Fourier  transform  of  a  “windowed”  kernel  (evaluated 
using  a  standard  FFT).  To  describe  this  in  operator  form,  the  impedance  matrix  is 
decomposed  as: 

=  +  (0.9) 
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Where  is  a  sparse  matrix  that  represents  interactions  of  closely  separated 

source/observation  cell  pairs,  and  represents  the  reaction  of  souce/observation  cell 
pairs  that  are  sufficiently  far  removed,  and  j  is  the  unknown  vector  of  the  current 
coefficients.  The  discontinuous  FFT  presented  in  the  previous  section  will  be  used  to 
evaluate  the  product  .  To  this  end,  this  product  can  be  computed  as: 

Z/-j  =  V^Fr(Wj)  (0.10) 

where,  W  represents  a  matrix  operator  that  samples  the  currents  with  coefficient  vector  j 
at  the  quadrature  points,  and  maps  the  vector  to  g  defined  by  the  uniform  FFT  grid  [7],  H 
is  an  operator  defined  as: 

//(g)  =  FFr'{FF’r(K)-F’Fr(g)}  (0.11) 

where  K  is  the  discrete  kernel,  and  maps  the  uniformly  sampled  field  points  back  to 
the  quadrature  points  and  performs  the  inner  dot  product  with  the  test  vector  using  a 
fixed-point  numerical  quadrature.  Due  to  the  symmetry  of  the  operation,  it  can  be  shown 
that  V  =  W  .  Also,  from  a  practical  perspective,  the  discrete  kernel  K  is  windowed  such 
that  the  numerical  singularity  near  the  origin  is  removed.  Furthermore,  FFT{K)  is 
performed  only  once  and  stored. 

The  operator  in  (0.10)  is  a  global  operator  that  computes  all  source-observation  pairs, 
and  hence  corrupts  the  near  fields.  Thus,  the  near  interaction  must  be  "pre-corrected." 
This  is  done  as: 

Z"“"=Z"“^-W/W^  (0.12) 

It  is  noted  that  this  can  be  done  efficiently  and  accurately  during  the  matrix  fill 
analytically  via  a  discrete  convolution,  or  more  efficiently  using  localized  FFT's.  Finally, 
the  matrix  vector  product  is  approximated  as: 

Zj  =  Z""''l  +  W/A^^j  (0.13) 

resulting  in  an  algorithm  that  requires  0(MogiV)  floating  point  operations  and  0{N) 
storage. 

To  demonstrate  the  QSPCFFT  algorithm,  consider  a  microstrip  patch  array  antenna,  as 
illustrated  in  Fig.  5.6.  The  antennas  are  printed  on  a  31  mil  substrate  with  =  4.4 .  The 
scattering  parameters  of  the  2  element  array  are  also  illustrated  in  Fig.  5.6  and  are 
compared  to  those  computed  via  Zeland  IE3D™.  The  UKY  fast  solver  was  then  used  to 
simulate  the  array  as  the  #  of  elements  was  increased  from  2  to  16.  The  solution  was 
obtained  via  the  present  method  and  the  QSPCFFT  algorithm  with  a  BiCGSTAB(l) 
iterative  solver  [14]  with  1  =  4.  The  CPU  times  required  to  compute  the  near  field  matrix 
and  its  pre-corrections  are  graphed  in  Fig.  5.7  as  a  function  of  N.  Also,  the  CPU  time  per 
iteration  is  also  presented.  It  is  noted  that  on  average  ~65  iterations  per  port  excitation 
was  required  to  reach  a  converged  solution.  The  memory  required  by  the  QSPCFFT 
algorithm  is  presented  in  Fig.  5.7  as  well  and  is  compared  to  that  required  by  a  full-direct 
solve  (assuming  complex  double  precision  for  both  approaches).  Observing  these  results, 
it  is  seen  that  the  fill  time  scales  close  to  0{N),  the  cost  per  iteration  scales  close  to 
C)(MogA0,  and  the  memory  as  0{N),  as  expected  by  the  algorithm. 

A  final  example  is  a  spiral  inductor  array  loaded  microstrip  line,  illustrated  in  Fig.  5.8 
(a).  The  line  is  printed  on  a  10  pm  InP  substrate.  Each  inductor  has  a  radius  of  200  pm. 
The  spacing  between  inductor  load  pairs  is  also  200  pm.  The  scattering  parameters  were 
extracted  via  the  QSPCFFT  accelerated  method  of  moment  solution  over  the  frequency 
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range  of  0.1  -  100  GHz.  These  results  are  illustrated  in  Fig.  5.8  (c).  These  results  are 
also  compared  to  those  computed  via  Zeland  IE3D.  Excellent  agreement  is  observed. 
Results  obtained  via  the  QSPCFFT  solution  are  compared  to  those  obtained  via  a  direct 
solver  in  Fig.  5.8  (d).  The  results  compare  to  better  than  2  digits. 
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Fig.  5.6  2-element  microstrip  patch  antenna  array  filter  printed  on  a  31  mil  substrate  ~  4-4 )  and  the 
scattering  parameters  computed  by  UKY  fast  solver  and  Zeland  IE3D. 
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Fig.  5.7  Scaling  of  CPU  time  and  memory  as  the  patch  array  in  Fig  2  is  increased  from  2  to  16  patch 
anteimas.  The  memory  compares  the  QSPCFFT  solution  with  a  direct  solution  method  with  full- 
matrix  storage. 
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Fig.  5.8  Spiral  inductor  loaded  microstrip  line,  (a)  top  view,  (b)  cross  section  of  the  microstrip  line  and 

substrate,  (c)  comparison  of  the  UKY  QSPCFFT  solution  and  Zeland  IE3D,  and  (d)  comparison  of 
QSPCFFT  and  direct  LU  factorization  solutions  via  the  UKY  software. 
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CHAPTER  6  IMPACT  ON  MIXED  SIGNAL  1C  DESIGN  FLOW  AND 

VALIDATION 


6.1  Impact  And  Validation 

Despite  the  huge  investment  in  CAD  software  infrastructure,  analog  design  is  still  largely 
a  manual  and  oftentimes  painful  process  that  gets  exponentially  harder  as  the  circuit 
becomes  more  complex.  NeoCAD  was  designed  to  directly  impact  this  problem  by 
enabling  a  faster  process  flow  that  leads  to  fewer  errors  at  tape-out. 

To  understand  the  impact  of  the  NeoCAD  effort,  we  first  look  at  a  standard  analog  IC 
design  flow  in  Fig.  6.1.  The  design  progresses  from  a  system  architecture  to  a  functional 
design,  to  a  transistor-level  circuit  design.  The  circuit  is  then  laid  out,  and  based  on  the 
parasitics  of  the  layout,  the  circuit  is  once  again  modified.  The  process  is  quite  lengthy 
and  entails  many  compromises.  For  example,  if  a  fixed  amount  of  time  and  resources  are 
available  to  design  the  circuit,  the  circuit  may  not  be  fully  tested  before  release; 
alternately,  if  the  circuit  is  carefully  tested,  the  design  may  take  too  long  to  complete. 

The  major  bottleneck  in  testing  and  verification  is  the  simulation  times.  By  reducing  the 
simulation  times,  we  can  make  more  tests  in  a  given  time  and  hence  improve  our  chances 
of  first-pass  success.  NeoCAD  addressed  fundamental  issues  in  each  of  these  blocks  to 
improve  speed  and  accuracy.  While  preceding  sections  cover  the  results  of  the  NeoCAD 
project  in  more  detail,  this  section  provides  selected  examples  to  illustrate  its  impact  on 
the  design  flow. 


Fig.  6.1;  Analog  circuit  design  flow. 

6.1.1  Functional  Design  Improvements 

Fig.  6.2  shows  the  design  flow  before  and  after  NeoCAD.  Before  the  NeoCAD  project, 
we  created  a  circuit,  such  as  a  AZ  modulator,  first  at  the  ideal  behavioral-level,  and  after 
verification,  moved  directly  to  a  transistor-level  model  that  we  simulated  in  Cadence. 

This  jump  led  to  many  problems.  First,  verification  at  the  transistor-level  was  very 
lengthy  because  the  circuits  tend  to  be  very  complex  and  take  a  long  time  to  simulate. 
Because  the  verification  is  so  lengthy,  it  became  time-prohibitive  to  fully  simulate,  debug 
and  optimize  the  transistor-level  circuit.  As  a  result,  there  is  a  reasonable  likelihood 
problems  will  not  get  caught  before  the  design  is  released  to  fabrication. 
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to  optimize  performance  from  an  idealized  AS  modulator.  This  model,  however,  does  not 
include  effects  such  as  quantizer  metastability  that  result  from  the  latch  RC  time  constant. 
As  the  joint  effect  of  the  RC  time  constant  and  Qn,  on  the  SNR  is  currently  not  solvable 
analytically,  we  would  nominally  run  a  series  of  simulations  to  see  the  effect  of  this 
nonideality  on  the  SNR.  However,  the  simulation  times  are  prohibitive  to  do  such  an 
analysis  at  the  transistor  level. 

We  are,  however,  able  to  perform  the  simulations  using  the  non-ideal  behavioral-level 
models  developed  under  NeoCAD.  It  shows  that  the  original  value  of  Qfb  does,  in  fact, 
lead  to  optimal  SNR  values.  However,  small  positive  deviations  in  Qfb  lead  to  a  large 
degradation  in  the  SNR,  and  oftentimes,  causes  the  circuit  to  go  unstable.  By  simply 
reducing  the  value  of  Qfb,  the  SNR  drops  slightly  but  the  circuit  becomes  much  more 
robust.  Thus,  as  a  result  of  the  nonlinear  behavioral-level  models,  we  were  able  to  make 


our  circuit  more  robust  than  would  be  possible  using  the  conventional  design 
methodology. 


Fig.  6.4  :  SNR  of  AS  modulator  as  function  of  Qn,  and  latch  RC  time  constant. 


It  is  important  to  recognize  the  general  utility  of  these  models.  There  are  currently  very 
few  bandpass  AS  modulators  being  designed  in  industry.  However,  these  AS  modulators 
are  designed  using  basic  building  blocks  that  are  very  common  throughout  the  industry. 
Because  of  this,  effort  developing  the  nonlinear  models  can  pay  off  in  a  variety  of  other 
circuits.  Fig.  6.5,  for  example,  shows  a  tunable  DAC  mismatch  shaper.  This  circuit  is 
used  to  improve  the  SNR  of  a  DAC  by  shifting  the  mismatch  error  spectrum  out-of-band. 
While  performing  a  largely  different  function,  there  are  many  common  blocks  that  have 
nonlinear  models  developed  on  the  NeoCAD  project. 
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Have  plug-in  NeoCAD  models 

Have  NeoCAD  models  that  need  revision 


Fig.  6.5:  DAC  mismatch  shaper  circuit. 

6.1.2  Simulation  Time  Improvements 

A  major  effort  of  the  NeoCAD  project  is  to  reduce  the  simulation  time  for  a  given  set  of 
models.  HAARSPICE  could  enable  lOX  faster  simulations  under  a  given  set  of  test 
conditions.  This  speed  improvement  manifests  itself  in  two  ways.  First,  the  design  cycle 
time  can  decrease.  Second,  for  a  given  design  time  allowed,  the  HAARSPICE  speed 
improvement  can  lead  to  more  design  iterations  and  a  more  robust  circuit. 

We  see  an  example  of  this  in  Fig.  6.6,  which  shows  the  output  of  a  2-tone  test.  In  Fig.  6.6 
(a),  time  constraints  limit  the  run  length  of  the  time-domain  simulation,  and  the  FFT 
yields  little  information.  The  two  tones  are  not  yet  clearly  distinguished.  By  contrast,  an 
8X  longer  simulation  time  can  yield  an  FFT  shown  in  Fig.  6.6  (b).  The  input  tones  and 
third-order  harmonics  are  clearly  visible.  Because  of  the  accurate  determination  of  the 
nonlinearity,  it  is  easier  to  debug  and  optimize  the  circuit. 


(a) 


(b) 


Fig.  6.6:  FFT  of  time-domain  output  for  (a)  8K  clock  cycles,  and  (b)  64K  clock  cycles.  The  simulation 
yields  far  more  information  with  more  FFT  points. 
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6.1.3  Parasitic  Extraction  Improvements 

After  transistor-level  design  and  layout,  simulations  are  performed  to  see  the  effect  of 
parasitic  resistances,  inductances,  and  capacitances  on  the  performance  of  the  circuit. 

Fig.  6.7  shows  the  flow  before  and  after  BEMP.  Before  BEMP,  parasitic  extraction  was 
performed  with  Cadence  or  Sonnet.  Both  of  these  programs  had  severe  shortcomings. 
Cadence  has  a  piecewise  model  that  models  each  wire  section  as  a  circuit  element.  If 
there  are  many  sections  to  the  wire,  the  resultant  model  becomes  extremely  complicated. 
Secondly,  long  sections  are  treated  with  the  same  precision  as  short  sections,  and  hence 
the  models  of  the  long  sections  are  less  precise  than  the  short  sections,  even  though  the 
long  sections  need  to  be  more  precise.  Thirdly,  an  LRC  extraction  causes  so  many  nodes 
that  realistic  post-layout  simulations  cannot  be  performed  on  complex  circuits.  Finally, 
Cadence’s  extractor  functions  in  such  a  way  that  only  nearest-neighbor  interactions  can 
be  modeled.  Sonnet’s  EM-based  software  avoids  these  drawbacks  but  has  an  overly 
simplified  model  that  uses  only  one  inductor,  capacitor  and  resistor  between  a  line’s  input 
and  output.  The  model  is  hence  accurate  at  only  one  frequency.  BEMP,  by  contrast,  uses 
an  EM-based  distributed  model  that  is  accurate  across  a  wide  bandwidth.  Importantly, 
this  increase  in  accuracy  does  not  seem  to  cost  a  significant  amount  of  simulation  time. 

In  fact,  the  BEMP-based  model  could  offer  up  to  a  1  OX  reduction  in  simulation  time  over 
those  performed  using  Cadence’s  LRC-based  parasitics. 


Fig.  6.7:  Verification  process  before  and  after  BEMP. 

To  verify  the  accuracy  of  BEMP  extraction  vs.  Sonnet  extraction,  we  used  a  32-bit  bus 
line  on  an  ADC,  shown  in  Fig.  6.8. 
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Fig.  6.8:  Bus  line  on  ADC.  Width  of  structure  is  about  800  |rm. 


A  tapered  square  wave  was  then  fed  into  this  bus  to  test  for  ringing.  The  results  are 
shown  in  Fig.  6.9.  The  BEMP  response  is  somewhat  obscured  because  the  peaks  lie  on 
the  lines  1 .0  and  -1.0,  but  overall  show  far  less  ringing  with  the  Sonnet  model.  This  then 
raises  the  question  of  which  model  is  more  accurate.  Further  runs  with  Sonnet 
demonstrated  that  the  ringing  was  highly  dependent  on  the  frequency  from  which  the 
parasitics  were  extracted.  This  is  a  natural  problem  which  results  from  an  extremely 
simple  one-stage  LRC  model  used  by  Sonnet.  The  Sonnet  model,  extracted  at  high 
frequencies,  generated  very  little  ringing.  Since  the  voltage  transitions  have  very  high- 
frequency  components,  this  supports  the  results  from  BEMP  model  that  show  the  line  has 
little  true  ringing.  Because  of  this  result,  the  bus  was  determined  to  be  satisfactory  and  an 
otherwise  lengthy  optimization  process  was  avoided. 


Transient  Response 

SONNET 


Fig.  6.9:  Output  of  square  pulse  on  bus  using  Soimet  and  BEMP  models. 


6.2  Conclusions 


NeoCAD  led  to  several  key  improvements  in  the  analog  design  flow.  By  implementing 
non-ideal  behavioral-level  models,  NeoCAD  allowed  for  a  good  deal  of  optimization  to 
be  performed  early  in  the  process.  Using  HAARSPICE,  simulations  could  run  up  to  lOX 
faster,  enabling  more  simulations  in  a  fixed  design  time.  Finally,  BEMP  allowed  faster, 
more  accurate  parasitic  extraction  than  was  available  using  Cadence  or  Sonnet.  As  a 
result  of  these  efforts,  NeoCAD  enabled  HRL  to  design  more  robust  circuits  faster. 
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Introduction 


HAARSPICE  circuit  simulator  is  a  suite  of  programs  used  for  Time  domain  and 
DC  analyses  of  linear  and  nonlinear  analog  circuits  and  systems.  HAARSPICE 
was  developed  at  HRL  LLC  (Malibu,  CA)  under  the  DARPA-NEOCAD  contract.  \ 

HAARSPICE  suite  of  programs,  run  on  Linux  7.02  and  have  a  text-based 
interface.  HAARSPICE  accepts  ascii  text  netlist  of  electronic  circuits  generated 
using  Cadence’s  Schematic  capture  environment. 

HAARSPICE  Package  Programs 

Typing  in  “haarSPICE”  on  the  command  line  prompt,  the  following  options  are 
displayed 

enl  -  Eigenspace  non-linear  solver  (jacobian) 
eld  -  Eigenspace  linear  domain  solver 

ese  -  Eigenspace  evolution  nl-domain  solver (jacobian  free) 
dcs  -  DC  sweep 

snlc-  Supported  netlist  components 
dcp  -  DC  Point  of  circuit  at  t  =  0 


In  the  next  sections  we  shall  go  over  each  of  them  briefly  and  with  the  aid  of  an 
example  highlight  their  attributes.  The  example  used  is  a  4*^  order  continuous 
time  band-pass  delta-sigma  modulator  designed  at  HRL. 


Components 


HAARSPICE 


Being  a  Research  code  HAARSPICE  is  often  updated  to  support  more 
components.  The  current  status  of  HAARSPICE  supported  components  can  be 
checked  by  typing  snic  (i.e.  supported  netlist  components),  on  the  command  line 
prompt. 

available  netlist  components: 

tag  (...)  atod|AtoD  parameters  ??? 
tag  (...)  bjt|NPN|PNP  parameters  ??? 

tag  (nodeA  nodeB)  capacitor  c=value  "value  of  capacitance  between 
nodeA  and  nodeB" 

tag  (...)  cccs I cccs_hdl  parameters  ??? 

tag  (...)  ccvs  parameters  ??? 

tag  (...)  delay  I delay_op  parameters  ??? 

tag  (...)  quantizer! I differentialQuantizer  parameters  ??? 
tag  (...)  diode  I diode_hdl I diode_sch  parameters  ??? 
tag  (...)  dsmod I  delta- sigma-modulator  parameters  ??? 
tag  (...)  dtoa|DtoA  parameters  ??? 

tag  (nodeA  nodeB)  inductor  l=value  "value  of  inductance  between 
nodeA  and  nodeB" 

tag  (...)  isource I csource  parameters  ??? 
tag  (...)  quant izerl I  quantizer  parameters  ??? 

tag  (nodeA  nodeB)  resistor  r=value  "value  of  resistance  between 

nodeA  and  nodeB" 

tag  (...)  sign  parameters  111 

tag  (nodeAl  nodeA2  nodeBl  nodeB2)  transformer  nl=value  n2=value2 

"valuel/value2  is  gain  between  nodes  A  and  nodes  B" 

tag  (...)  vccs  parameters  111 

tag  (...)  vcvs  parameters  111 

tag  (...)  vsource  parameters  111 

???  denotes  instance  parameters 


Using  the  listed  components,  the  user  can  generate  the  netlist  of  circuits.  In  the 
example  shown  we  have  the  netlist  generated  with  Cadence’s  schematic  capture 
software.  Currently  we  support  linear  and  non-linear  components.  The  devices 
supported  are  PN  Junction  Diode,  NPN  and  PNP-type  Bipolar  Junction 
Transistor  (BJT) 

The  models  used  are  large-signal  nonlinear  and  can  simulate  DC  and  transient 
domain  circuit  responses.  For  BJT  we  use  a  Large  Signal  Gummel-Poon  type 
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model  and  for  the  diode  a  charge  based  model  has  been  used.  The  next  section 
gives  detail  description  of  each  component  and  its  attributes. 


QJJ231I3 


Netlist  Generated  using  Cadence’s  Schematic  Capture  Tool 


parameters  Qfb=-0 . 5*164 . 6m  K4=-0 . 5*82 . 55u  K3=0 . 5*3 . 332m  K2=0 . 5*1 . 933m  \ 
K1=0 . 5*4 . 833u  gmb2=-l. 0*6. 383m  gmbl=-l . 0*6 . 383m  gm4=0 . 5*6 . 383m  \ 
gm3=0.5*2.5m  gm2=0 . 5*6 . 3 83m  gml=0 . 5*1 . 66m  C4=lp  C3=lp  C2=lp  Cl=lp  \ 
A4=-100  A3=-100  A2=-100  Al=-100 
parameters  mgml=-gml 
parameters  mgm2=-gm2 
parameters  mgm3=-gm3 
parameters  mgm4=-gm4 
parameters  mAl=-Al 
parameters  mA2=-A2 
parameters  mA3=-A3 
parameters  mA4=-A4 
//  Library  name:  eelOlLib 
//  Cell  name:  balun 
//  View  name:  schematic 
subckt  balun  c  d  n  p 

_inst0  (dope)  transformer  nl=2 
KO  (d  0  c  n)  transformer  nl=2 
ends  balun 

//  End  of  subcircuit  definition. 

//  Library  name:  NeocadLib 

//  Cell  name:  SD4D 

//  View  name:  schematic 

139  (0  net0291  net2  netl)  balun 
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113  (netl56  netl24  nets  netV  netl68)  quantizers  quantizer_vth=0 . 0  \ 
clk_vth=0 

G7  (net44  0  netl04  netll2)  vccs  gm=mgm4 
G6  (net56  0  _net3  0)  vccs  gm=K3 
Gmb2  (net56  0  net46  0)  vccs  gm=gmb2 
GMBld  (net60  0  netSO  0)  vccs  gm=gttibl 


G4 

(netS4 

0 

_net3  0) 

VCCS 

gm=K2 

G3 

(netSS 

0 

net82  netSO)  vccs  gm=mgm3 

G8 

(net44 

0 

_net3  0) 

vccs 

gm=K4 

GMld  (netSO 

0  netl 

_net2 ) 

vccs  gm=mgml 

GM2d  (net64  0  net98  net94)  vccs  gm=mgm2 

DACld  (net60  0  _net3  0)  vccs  gm=Kl 

GMl  (net68  0  _netl  _net2)  vccs  gm=gml 

DACl  (net68  0  _netO  0)  vccs  gm=Kl 

DAC2  (net72  0  _netO  0)  vccs  gm=K2 

GM2  (net72  0  net98  net94)  vccs  gm=gm2 

DAC3  (net76  0  _netO  0)  vccs  gm=K3 

GM3  (net76  0  net82  netSO)  vccs  gm=gm3 

GM4  (netSO  0  netl04  netll2)  vccs  gm=gm4 

DAC4  (netSO  0  _netO  0)  vccs  gm=K4 

Gmbl  (net68  0  net82  0)  vccs  gm=gmbl 

G12  (net76  0  net86  0)  vccs  gm=gmb2 

C7  (net44  net46)  capacitor  c=C4 

C2d  (net64  netSO)  capacitor  c=C2 

Cld  (neteo  net94)  capacitor  c=Cl 

Cil  (net68  netSO)  capacitor  c=100.0f 

_inst4  (netSS  net98)  capacitor  c=Cl 

_inst5  (net72  net82)  capacitor  c=C2 

Ci2  (net72  net64)  capacitor  c=100.0f 

C8  (net76  netl04)  capacitor  c=C3 

Ci3  (net76  netSS)  capacitor  c=100.0f 

Ci4  (netSO  net44)  capacitor  c=100.0f 

_inst6  (netSO  netSS)  capacitor  c=C4 

C6  (netSS  netll2)  capacitor  c=C3 

Ril  (netSS  netSO)  resistor  r=lM 

Ri2  (net72  netS4)  resistor  r=lM 

Ri3  (net7S  netSS)  resistor  r=lM 

Ri4  (netSO  net44)  resistor  r=lM 

E3  (net4S  0  netSO  net44)  vcvs  gain=mA4 

Av2d  (netSO  0  net72  netS4)  vcvs  gain=mA2 

E2  (netll2  0  net7S  netSS)  vcvs  gain=mA3 

E4  (netl24  net4S  _net3  0)  vcvs  gain=Qfb 

Avl  (net98  0  netSS  netSO)  vcvs  gain=Al 

Av2  (net82  0  net72  netS4)  vcvs  gain=A2 

Av3  (netl04  0  net7S  netSS)  vcvs  gain=A3 

Av4  (netSS  0  netSO  net44)  vcvs  gain=A4 

E7  (netlSS  netSS  _net0  0)  vcvs  gain=Qfb 

Avid  (net94  0  netSS  netSO)  vcvs  gain=mAl 

VO  (net0291  0)  vsource  type=sine  ampl=2S0 . Om  freq=998M 
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134  (_net3  net?)  delay_op  td=125p 
119  {_netO  nets)  delay_op  td=125p 

VI  (netl68  0)  vsource  type=pulse  val0=-l  vall=l  period=250p  rise=0p  \ 
fall=0p  width=125p 

linearDomain  start=0  stop=125p 

.  end 


The  above  netlist  is  a  representation  of  the  4*'^  order  delta-sigma  modulator  with 
linear  components.  The  1-bit  quantizer  used  is  the  only  non-linearity  in  the  circuit. 
We  next  explain  the  syntax  rules  of  some  of  the  components  in  the  netlist. 


Linear  Components 

Resistor 

Syntax:  Name  N+  N-  resistor  parameter=value 

N+,  N-:  nodes 
Supported  parameter;  r 
value :  Resistance  (Q 

Maximum  resistance  supported  is  /cmm 


Capacitor 

Syntax:  Name  N+  N-  capacitor  parameter=value 

N+,  N-:  nodes 
Supported  parameter;  c 
value :  Capacitance  (Farads) 


Inductor 

Syntax:  Name  N+  N-  inductor  parameter=value 

N+,  N-:  nodes 
Supported  parameter;  / 
value ;  Inductance  (Henry) 


109 


Ideal  Transformer 

Syntax:  Name  NP+  A/P-  NS+  NS-  xfmr  n1=value  n2=value 

NP-I-,  NP-:  primary  nodes 

NS+,  Ns-:  secondary  nodes 

Supported  parameters: 

n1 :  Number  of  turns  in  primary  winding 

n2:  Number  of  turns  in  secondary  winding 

An  ideal  transformer  is  modeled,  so  it  acts  as  a  transformer  at  DC 


Independent  Voltage  Source 

Syntax:  Name  N+  N-  vsource  parameter=value 

N+,  N-:  nodes 

Supported  parameters: 
type=  dc,  pulse,  sine 

freq=Frequency  of  sinusoidal  waveform  (Hz) 
amphPeak  ampiitude  of  sinusoidal  waveform  (Volts) 
rise=:Rise  time  for  pulse  waveform  (seconds) 
faii=Fall  time  for  pulse  waveform  (seconds) 
period=  Period  of  waveform  (seconds) 
width=  Pulse  width 


Independent  Current  Source 

Syntax:  Name  N+  N-  isource  parameter=value 

N+,  N-:  nodes 

Supported  parameters: 
type=  dc,  pulse,  sine 

freq=Frequency  of  sinusoidal  waveform  (Hz) 
ampi=Peak  amplitude  of  sinusoidal  waveform  (Amps) 
nse=Rise  time  for  pulse  waveform  (seconds) 
faii=Fall  time  for  pulse  waveform  (seconds) 
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period=  Period  of  waveform  (seconds) 
width=  Pulse  width 


Linear  voltage  controlled  current  source 

Syntax:  Name  N+  N-  NC+NC-  vccs  parameter=value 

N+,  N-:  source  and  sink  nodes 
NC+,  NC-;  Control  nodes 

Supported  parameters; 

gm=  transconductance  (siemens,  AA/) 


Linear  voltage  controlled  voltage  source 


Syntax:  Name  N+  N-  NC+NC-  vcvs  parameter=value 

N+,  N-;  source  and  sink  nodes 
NC+,  NC-:  Control  nodes 


Supported  parameters: 
gain=voltage  gain  (VA/) 

Linear  current  controlled  current  source 


Syntax:  Name  N+  N-  NC+NC-  cccs  parameter=value 

N+,  N-:  source  and  sink  nodes 
NC+,  NC-:  Control  nodes, 

The  control  nodes  are  defined  across  a  current  probe. 

Supported  parameters: 
gain=current  gain  (A/ A) 

Linear  current  controlled  voltage  source 

Syntax:  Name  N+  N-  NC+NC-  ccvs  parameter=value 

N+,  N-:  source  and  sink  nodes 
NC+,  NC-:  Control  nodes 

The  control  nodes  are  defined  across  a  current  probe 
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Supported  parameters: 
gain=trasresistance  (V/A,  ohms) 


Nonlinear  Components 

Diode 

Syntax:  Name  A  C  diode  parameter=value 
A:  Anode 
C:  Cathode 


Supported  parameters: 
is:  Saturation  Current  (A) 
vto:  Threshold  Voltage  (V) 
tt:  Transit  time  (sec) 
phi:  Junction  potential  (V) 

cjo:  Zero-bias  junction  bottom  capacitance  density  (F/nf) 
m:  Multiplier 

fcp:  Forward-bias  capacitance  coefficient 

Bipolar  Junction  Transistors 

Syntax:  Name  CBE  NPN/PNP  parameter=value 
C:  Collector 
B:  Base 
E:  Emitter 


Supported  parameters: 

area:  cross  section  area  (rrf) 

is:  saturation  current  (A) 

ise:  base-emitter  leakage  current  (A) 

isc:  base-collector  leakage  current  (A) 

bf:  beta  forward 
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br:  beta  reverse 

nf:  forward  emission  coefficient 

nr.reverse  emission  coefficient 

ne:  b-e  leakage  emission  coefficient 

nc:  b-c  leakage  emission  coefficient 

vaf:  forward  Early  voltage  (V) 

van  reverse  Early  voltage  (V) 

ikf:  forward  knee  current  (A) 

ikr:  reverse  knee  current  (A) 

cje:  capacitance,  base-emitter  junction  (F) 

vje:  voltage,  base-emitter  junction  (V) 

mje:  b-e  grading  exponential  factor 

cjc:  capacitance,  base-collector  junction(F) 

vjc:  voltage,  base-collector  junction(V) 

mjc:  b-c  grading  exponential  factor 


cjs;  capacitance,  collector-substrate  junction  (F) 

vjs:  voltage,  collector-substrate  junction(V) 

mjs:  c-s  grading  exponential  factor 

fc:  forward  bias  capacitance  factor 

tf:  ideal  forward  transit  time(s) 

xtf:  tf  bias  coefficient 

vtf:  tf-vbc  dependence  voltage(V) 

itf:  high  current  factor 

tr:  reverse  diffusion  capacitance(s) 


Differential  Quantizer  (quantizer2D) 

Syntax:  Name  In-h  In-  Out+  Out-  Clk  quantizer2  parameter=value 

ln+,  In-:  Quantizer  Input 

Out+,  Out-:  Quantizer  Output 

Clk:  Clock  input  (the  clock  is  not  differential) 

Supported  parameters: 

quantizer_vth:  Quantizer  threshold  level  (V) 

clk_vth:  Clock  threshold  level  (V) 

vievel:  Output  voltage  level  (V) 

trise:  Rise  time  (s) 

tfall:  Fall  time  (s) 


Miscellaneous  Components 
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Delay 

Syntax:  Name  In  Out  delay  parameter=value 
In:  Delay  Input 
Out:  Delay  Output 

Supported  parameters; 
td:  time  delay  (s) 

Interface  Elements  (AtoD,  DtoA) 

Added  Internally 

Sign 

Added  Internally 

To  get  better  convergence  HAARSPICE  adds  minimum  valued  resistors 
(i.e.GMIN)  and  Capacitors  (i.e  cmin). 


HAARSPICE  SOLVERS 


EIGENSPACE  LINEAR  DOMAIN  SOLVER  (eld) 

The  eigenspace  linear  domain  solver  (eld)  inside  HAARSPICE  performs  time- 
domain  analysis  using  a  Jacobian-free  eigenvalue  solving  algorithm. 
HAARSPICE-eld  gives  ~10X  speed  up  in  comparison  with  uniform  time-stepping 
and  adaptive  time-stepping  algorithms  which  are  used  in  generic  SPICE  and 
other  time-domain  circuit  simulators. 

Command 

eld  (filename  I -testcircuit)  log2 (timesteps)  run-time 
[-savenodes]  [-writecircuit] 


Attributes 

filename:  Currently  “HAARSPICE-eld”  supports  Cadence  Analog  Artist  generated 
“spectre  like”  netlist.  The  components  used  are  described  in  the  earlier  section. 

All  spectre  related  analyses  statements  are  ignored  by  HAARSPICE-eld. 

variables:  It  includes  two  sub-attributes 

timesteps  order:  <4-20>  number  of  timesteps 
runtime:  time  for  which  the  simulation  is  run 
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savenodes:  By  default  all  the  nodes  in  the  netlist  are  saved.  Also  includes  branch 
currents  or  pseudo-nodes  which  describe  the  currents  in  a  Modified  Nodal  Matrix. 

writecircuit:  Writes  out  information  about  circuit 


~ /eld/order# ,  totaltime 

for  example  a  directory  generated  by  running  a  simulation  for  4.098  fs  with  an 
order  of  15  (2^®)  is  named  “/eld/2Ei5, 4u" 


The  data  files  generated  are  in  ascii  text  format  and  have  4  columns  of  numbers 
in  double  float  type.  An  example  of  a  data  file 


<time> 

6.250000000000e-05 
1 .8750000000000-04 
3.1250000000000-04 


<valu0> 

4.1691600884450-11 

5.0000000052000-01 

5.0000000071460-01 


<fr0qu0ncy>  <sp0ctral  dansity  /w  Blackman  window> 


O.OOOOOOOOOOOO0-hOO 

2.4414062500000-01 

4.8828125000000-01 


-5.36522491 08370-t-OI 
-5.3124662069370-1-01 
-5.5OO7547O36730-hO1 


Example 


The  4**^  Order  CT  IModulator  described  in  the  Component  section  is  simulated 
using  HAARSPICE-eld  for  4.096 /sec  with  time-step  order  of  15  (i.e.  2^®  time 
steps).  The  number  of  time  steps  or  time  intervals  used  in  our  example  equals  to 
the  number  of  clock  pulses.  The  clock  is  running  at  4  GHz  with  a  pulse  width  of 
125ps.  Therefore  2^®  x  125ps  =  4.096  /sec 


Using  the  command 


>>  eld  sd4d  15  4096n 


“sd4d”  is  the  name  of  the  netlist  file. 


By  default  the  nodes  as  output  files  get  stored  in  the  directory 

~/SD4D_new-out/eld/2E15, 4u 


Avl :  1 

Av4  : 1 

139 ._inst0 : tl 

netl04 

_net2 

net56 

net72 

net86 

Avid : 1 

E2  : 1 

I39.K0:tl 

netll2 

_net3 

net60 

net76 

net94 

Av2  : 1 

E3  :  1 

_net0 

netl24 

net44 

net64 

nets 

net98 

Av2d: 1 

E4  :  1 

net0291 

netl56 

net46 

net68 

netSO 

VO  :p 

Av3  : 1 

E7  :  1 

netl 

netl68 

netSO 

net? 

net82 

VI  :p 

We  plot  the  spectrum  of  the  output  node  (i.e.  columns  3  and  4  from  “_net0”)  of 
the  modulator  using  gnuplot. 
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ator  using 


Performance  Metrics 

HAARSPICE-eld  performance  for  a  4**^  order  Band  Pass  Delta-Sigma  Modulator 
circuit  was  tested  on  a  DELL  C800  Laptop  running  Linux  (Red  Hat  7.02) 
operating  system. 


The  Table  shown  below  gives  the  test  results: 


HAARSPICE  -eld 

S  PIC  E/Spectre 

Platform 

63  seconds 

593  seconds 

Linux  7.02 

In  comparison  with  time-stepping  algorithms  used  in  SPICE,  HAARSPICE-eld 
gives  ~10X  improvement  in  speed. 

A  performance  metric  giving  the  measure  of  accuracy  is  the  Dynamic  Range  of 
the  simulator.  To  measure  the  DR  of  the  simulator  we  perform  a  two  tone  test  on 
the  4'^  order  Delta-Sigma  Modulator. 
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4th  order  DS  modulator 
non-linearity  :  v  +  0.01  ♦v‘^3.  v  —  v  I  +  v2 


frequency,  MHy. 

Fig.  4:  Spectrum  of  the  Two-Tone  Test  measured  at  the  output  for  different  number  of  clock 
cycles. 

The  Two  Tone  Test  has  one  input  signal  ui,  at  1008.5  MHz  with  magnitude  of 
500mV  and  a  second  signal  u2,  at  1007.5  MHz  with  a  magnitude  of  0.5mV.  The 
nonlinearity  introduced  is  through  the  1®*  Trans-conductance  Cell  (i.e.  Gm-Cell)  of 
the  Modulator.  The  only  other  nonlinearity  is  that  of  the  quantizer.  The  1®'  Gm- 
Cell  nonlinearity  generated  a  3^“^  order  harmonic  is  seen  at  2x1008.5  -  1007.5  = 
1009.5MHz  as  seen  in  Fig.  4. 


Performance  Metrics 


The  Spurious  Free  Dynamic  Range  in  the  Two-Tone  test  is  taken  as  the  Dynamic 
Range  of  the  simulator. 

From  Fig.  4,  one  can  compute  the  DR  to  be  Pmax  -  Pmin  =  -20dB  -  (-200)  dB  = 
180  dB 

To  resolve  the  order  Harmonic  and  get  a  DR  of  more  than  170  dB  it  took 
HAARSPICE-eld  1  Million  Clock  Cycles. 


The  Table  shown  below  gives  the  DR  with  two-tone  test: 


HAARSPICE  (eld  routine) 

SPICE/Spectre 

Clock  Cycles 

180  dB 

173  dB 

512  K 
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This  makes  HAARSPICE-eld  particularly  useful  in  Communication  circuits  and 
systems  which  require  high  Dynamic  Range. 


EIGENSPACE  EVOLUTION  NONLINER  DOMAIN  SOLVER  (ese) 

The  eigenspace  evolution  nonlinear  domain  solver  (ese)  inside  HAARSPICE 
does  time-domain  analysis  of  nonlinear  circuits  using  a  Jacobian-free  eigenvalue 
solving  algorithm.  HAARSPICE-ese  gives  accurate  results  when  compared  with 
SPICE/Spectre.  HAARSPICE-ese  code  is  not  yet  optimized  and  therefore  we 
haven’t  compared  it  with  SPICE  or  Spectre  for  speed. 

Command 

ese  ( -netlist : filename)  integration_order 

log2 (timesteps) run-time  [-savenodes]  [-writecircuit] 


Attributes 

filename:  Currently  “HAARSPICE-ese”  supports  Cadence  Analog  Artist 
generated  “spectre  like”  netlist.  The  components  used  are  the  described  in  the 
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Component  section  of  this  manual.  All  spectre  and  verilog-ams  related  analyses 
statements  are  ignored  by  HAARSPICE-ese. 

variables:  It  includes  three  sub-attributes 

integration  order:  <2-4>  3  recommended 
timestep  order:  <4-20>  number  of  timesteps 
runtime:  time  for  which  the  simulation  is  run 

savenodes:  By  default  all  the  nodes  in  the  netlist  are  saved.  Also  includes  the 
branch  currents  or  pseudo-nodes  which  describe  the  currents  in  a  Modified 
Nodal  Matrix. 

writecircuit:  Writes  out  information  about  the  circuit 


HAARSPICE 


Example 

We  choose  a  3-latch  Quantizer  to  test  ese.  Each  latch  is  designed  using  hbt 
devices.  The  Quantizer  is  clocked  with  a  pulse  of  300mV  p-p  at  a  period  of  250ps 
(i.e.  4  GHz)  and  with  50%  duty  cycle.  Below  is  the  Fig.  of  a  single  latch  circuit. 
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R3 

r-sao 


P2 

p>5O0 


The  quantizer  netlist  with  the  three  latches  and  clock  can  be  found  in 
HAARSPICE  examples  directory. 

Using  the  command 

>>  ese  NL_latch_test  3  10  lOn 

“NL_latch_test”  is  the  name  of  the  netlist.  The  integration  order  used  is  3  and  the 
simulation  is  run  for  10ns.  It  takes  approximately  400  seconds  to  finish  the 
simulation  on  a  DELL  C800  laptop  running  Linux  operating  system. 


I 


Fia.  4:  3-Latch  Quantizer  test  circuit  with  input  of  250mV  at  980MHz 


HAARSPICE 


The  HAARSPICE-ese  results  are  plotted  with  gnuplot  are  shown  below 
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Performance  Metrics 

HAARSPICE-ese  performance  for  a  3-latch  Quantizer  used  in  a  4*^  order  Band 
Pass  Delta-Sigma  Modulator  circuit  was  tested  on  a  DELL  C800  Laptop  running 
Linux  (RedHat  7.02)  operating  system. 

The  Table  shown  below  gives  the  test  results. 


HAARSPICE  (ese) 

HAARSPICE  (enl) 

Generic  Time-stepping  code 

Platform 

400  seconds 

>  5000  seconds 

Linux  7.02 

HAARSPICE-ese  is  not  optimized  and  does  not  use  a  sparse-matrix  solver.  We 
are  working  on  adding  ARPACK  routines  to  do  mathematical  computations.  We 
expect  the  optimized  code  to  give  x10  speed  up  in  comparison  with  uniform  and 
adaptive  time  stepping  code. 


EIGENSPACE  NONLINER  SOLVER  (enl) 


IBS3S 
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The  eigenspace  nonlinear  solver  (enl)  inside  HAARSPICE  does  time-domain 
analysis  of  nonlinear  circuits  using  a  Jacobian-based  Newton  algorithm. 
HAARSPICE-enI  gives  accurate  results  when  compared  with  SPICE/Spectre. 
HAARSPICE-enI  code  is  not  yet  optimized  and  therefore  we  haven’t  compared  it 
with  SPICE  or  Spectre  for  speed.  We  are  working  on  adding  ARPACK  routines  to 
do  mathematical  computations.  We  expect  the  optimized  code  to  have  similar 
speed  up  in  comparison  with  uniform  and  adaptive  time  stepping  code. 

Command 

enl  ( -netlist : filename)  integration_order 

log2 (timesteps) run-time  [-savenodes]  [-writecircuit] 


Attributes 

filename:  Currently  “HAARSPICE-enI”  supports  Cadence  Analog  Artist  generated 
“spectre  like”  netlist.  The  components  used  are  the  described  in  the  component 
section  of  this  manual.  All  spectre  and  verilog-ams  related  analyses  statements 
are  ignored  by  HAARSPICE-ese. 

variables:  It  includes  three  sub-attributes 

integration  order:  <2-4>  3  recommended 
timestep  order:  <4-20>  number  of  timesteps 
runtime:  time  for  which  the  simulation  is  run 

savenodes:  By  default  all  the  nodes  in  the  netlist  are  saved.  Also  includes  the 
branch  currents  or  pseudo-nodes  which  describe  the  currents  in  a  Modified 
Nodal  Matrix. 

writecircuit:  Writes  out  information  about  the  circuit 

HAARSPICE-enI  and  HAARSPICE-ese  have  similar  syntax  and  the  same  3-latch 
Quantizer  netlist  shown  in  the  earlier  section  can  be  used  to  test  HAARSPICE- 
enI.  The  simulation  with  HAARSPICE-enI  will  be  10X  slower  in  comparison  with 
HAARSPICE-ese. 


HAARSPICE 


DC  Operating  Point  (dcp) 
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Although  our  goal  was  to  develop  a  fast  time  domain  simulator,  we  observed  that 
initial  conditions  (i.e.  DC  Operating  Point)  significantly  influenced  the  time  domain 
simulation  results.  A  fast  and  accurate  algorithm  for  calculating  the  DC  Operating 
point  was  required.  We  investigated  various  Homotopy/Continuation  methods 
and  found  GMIN  Stepping  and  Arc  length  Homotopy  methods  to  be  the  most 
suitable  for  our  purpose.  Inside  HAARSPICE-dcp  we  have  used  both  these 
methods.  Except  for  HAARSPICE-eld,  HAARSPICE-dcp  is  run  internally  during 
every  time  domain  simulation  to  compute  the  DC  operating  conditions  for 
nonlinear  circuits.  By  default  GMIN  stepping  is  used. 


Command 

dcp  (filename) 


Attributes 

filename:  Currently  “HAARSPICE-dcp”  supports  Cadence  Analog  Artist 
generated  “spectre  like”  netlist.  The  components  used  are  the  described  in  the 
Component  section  of  this  manual.  All  spectre  and  verilog-ams  related  analyses 
statements  are  ignored  by  HAARSPICE-dcp. 

Output:  The  DC  operating  point  results  are  saved  in  files  "f  ilename  .bic"  (i.e. 
Binary  format)  and  "filename .  tic"(i.e.  Text  format) 

Example 

We  choose  the  same  3-latch  Quantizer  netlist  used  in  the  section  on 
HAARSPICE-ese  to  test  HAARSPICE-dcs 

Using  the  command 

>>  dcp  NL_latch_test 

The  HAARSPICE-dcs  results  are  stored  inside  a  binary  file  “NL_latch_test.bic” 
and  in  a  text  format  inside  “NLJatch_test.tic” 


DC  Voltage  Sweep  (dcs) 
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DC  Voltage  sweep  inside  HAARSPICE  was  originally  written  to  test  Transistor 
and  Diode  l-V  characteristics  and  is  seldom  used  to  analyze  large  circuits. 


Command 

dcs  (filename) 


Attributes 

filename:  Currently  “HAARSPICE-dcs”  supports  Cadence  Analog  Artist 
generated  “spectre  like”  netlist.  The  components  used  are  described  in  the  earlier 
sections.  HAARSPICE-dcs  unlike  other  solvers  use  the  "dc"  analysis  statement 
from  the  spectre  netlist 

Output:  The  results  is  stored  in  a  directory  with  the  name 

" . /f ilename-out/dcsweep" 


Example 

We  use  a  simple  Diode  circuit  to  test  HAARSPICE-dcs.  The  diode  voltage  is 
swept  from  -40  to  5  volts.  The  netlist  used  can  be  found  in  the  examples 
directory. 

Using  the  command 

>>  dcs  diodebvtest 


The  desweep  results  are  stored  inside  “.  /diodebvtest-out/dcsweep" 
The  directory  listing  shows  the  following  files. 


DO:ROut_rs  table  Vdd  VDD:p 

The  "table"  file  list  the  following  contents,  which  can  be  used  to  make  plots 


Vdd 

-4.0000000000000+01 

-3.9750000000000+01 

-3.9500000000000+01 

-3.9250000000000+01 


VDD:p 

1 .0463988223460+00 
8.0323572589100-01 
5.6244812450410-01 
3.2650770052140-01 


D0:ROut_rs 

-3.8953601177730+01 

-3.8946764274190+01 

-3.8937551875580+01 

-3.8923492299560+01 


The  output  from  HAARSPICE-dcs  can  be  plotted  using  gnuplot 


With  the  gnuplot  commands: 

gnuplot>  set  data  styles  lines 

gnuplot>  plot  "diodebvtest-out/dcsweep/table"  us  l:(-$2) 


Example  Circuits  and  Error  Reporting 

The  CD  supplied  with  this  HAARSPICE  User  Manual  contains  the  netlists  of 
circuits  which  can  be  run  and  the  results  plotted  with  gnuplot. 

The  user  can  download  gnuplot  from  the  following  www  site. 

ht tp : // WWW . gnuplot . info/ download . html 

The  example  circuits  provided  with  the  CD: 

a.  4'^  Order  Continuous  Time  Band  Pass  Delta-Sigma  Modulator  (sd4d) 

b.  4*'^  Order  Continuous  Time  Band  Pass  Delta-Sigma  Modulator  with 
nonlinearity  introduced  through  input  (for  two  tone  tests)  (sd4d_2tone) 

c.  2"^  Order  Continuous  Time  Low  Pass  Delta-Sigma  Modulator  (sd2d) 

d.  Gummel-Poon  Transistor  3-Latch  Quantizer  (NL_latch_test) 

e.  Simple  Diode  Circuit  for  DC  Sweep  (diode_test) 

There  are  ready  to  use  shell  scripts  available  with  each  of  the  examples. 

Any  errors  or  comments  regarding  HAARSPICE  can  be  sent  to 

John  Visher  (visher@hrl.com) 

Rahul  Shringarpure  (rahuls@hrl.com) 

Peter  Petre  (Detre@hrl.com) 
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Broadband  efficient  macromodeling  program  (BEMP)  is  a  Windows-based 
software  package  that  captures  the  frequency  response  generated  from  an 
electromagnetic  tool  or  measurements  and  generates  both  SPICE  net  list  and  poles  and 
residue^  of  macrnmodels  for  design  and  simulation  of  mnlti-GHz  systems  as  shown  in 


Fig.  1. 


Broadband  Efficient  Macromodeling  Program  (BEMP) 


e«ieiB 


Poles  &  Residues 


~tl  »«KHg- _ _ 

Input  data 
I  (Touchstone  flle) 

'oEi 


Options 


Run  &  Plot 


Model  Order  Reduction 


Fig.  1  Broadband  Efficient  Macromodeling  Program 
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When  users  click  the  Setup  button,  BEMP  prompts 
to  select  or  type  a  Touchstone  file.  The  extension  of  the 
Touchstone  file  is  sNp,  where  N  represents  the  number  of 
ports.  Please  refer  to  Appendix  for  details 
Click  Open  button. 


BEMP  provides  two  methods  for  the  construction  of 
macromodels:  Manual  and  Automatic.  If  users  want  to  use 
Manual  option,  please  refer  to  Appendix  for  details.  When 
users  choose  Automatic  option,  BEMP  will  automatically  use 
its  own  methods.  A  developer,  Sung-Hwan  Min,  investigates 
more  robust,  sophisticate  algorithms  for  the  Automatic 
option. 

BEMP  also  provide  an  option  for  selecting  the  ports  if 
users  want  to  construct  a  certain  port  of  macromodels.  Please 
refer  to  Appendix  for  details  if  users  want  to  select  the  ports. 

Click  OK  button. 
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BEMP  loads  the  original  frequency  data  and  show 
the  number  of  ports  and  frequency  samples.  BEMP 
initializes  the  window  to  show  the  comparison  between  the 
original  data  and  the  response  of  macromodels  in  logarithm 
scale. 

Click  Run  button. 


BEMP  generates  the  macromodels  by  showing  the 
status  below  Setup  button. 

S(  177  314)  says  that  a  subband  having  samples  from 
177  and  314  is  being  constructed. 

LP;1  says  that  BEMP  captures  1  low-pass  pole. 
BP:32  says  that  BEMP  captures  32  band-pass  poles.  HP:0 
says  that  BEMP  captures  0  high-pass  pole.  AP:0  says  that 
BEMP  captures  0  all-pass  pole. 

Note  that  the  order  of  macromodels  is  33. 


If  users  want  to  stop  constructing  poles,  click  Stop  button 
If  not,  BEMP  runs  forever. 


After  constructing  the  macromodels,  BEMP  gives  a 
function  to  show  and  reduce  the  orders  of  models.  Users  can 
see  and  reduce  the  orders  by  specifying  the  frequency  range 
of  interest  or  the  minimum  magnitude  of  the  frequency 
response. 

If  Show  button  is  clicked  after  specifying  the  options, 
users  can  see  the  performance. 

If  Save  button  is  clicked  after  specifying  the  options, 
users  can  save  the  output  of  BEMP. 

If  Exit  button  is  clicked,  users  can  exit. 
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The  spurious  or  redundant  poles  are  removed  by 
comparing  the  frequency  range  with  the  imaginary  value  of 
poles.  For  example,  if  users  have  a  range  of  5MHz  to  2GHz, 
then  all  low-pass  poles  are  removed  and  all  band-pass  poles 
located  outside  the  band  of  interest  are  removed. 

For  example  provided,  users  specify  the  frequency 
range  from  OHz  to  5GHz.  The  BP  order  is  changed  from  34  to 
22.  Users  can  see  the  performance  of  MOR  of  BEMP. 


BEMP  calculates  the  maximum  magnitude  of  the 
original  data  (root-mean-squares).  Users  can  select  the 
minimum  magnitude  of  the  frequency  response  of  each 
macromodel. 

For  example  specified,  if  users  specify  the  5% 
percent  of  the  maximum  magnitude  of  the  frequency 
response,  the  BP  order  is  changed  from  34  to  10.  Users  can 
see  the  performance  of  MOR  of  BEMP. 


Step  7-b.  Save  Option 


BEMP  save  the  output  such  as  1)  the  frequency 
response  of  macromodels,  2)  the  poles  and  residues  of 
macromodels,  and  3)  SPICE  net  lists.  In  default,  the  port 
information  is  attached  in  the  original  filename  in  order  to 
distinguish  the  original  data  and  the  output  of  BEMP.  For 
example,  "Powerplane.slp"  is  changed  to 

"Powerplane_lofl.slp".  But,  users  can  change  the  name  of 
the  file. 

The  extension  of  the  file  (Touchstone  file)  having  the 
frequency  response  of  macromodels  is  the  same  as  the 
format  of  a  Touchstone  file. 
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The  extension  of  the  file  (Macromodel  file)  having 
the  poles  and  residues  of  macromodels  starts  with 
'm'. 


The  extension  of  the  file  having  SPICE  net  lists  starts 
with  V. 


Setup 

Stepg^Sr 


1  '  I  firrtfH 

V. 

BEMP  provide  Plot  button  to  show  the  accuracy  of 
macromodel. 

Click  Plot  button. 


Port;  fTjEJ  FmE 


Data 

A  Real 

C  Imaginary 
(•  Both 

rPlot - 

A  Normal 

A  Magnitude 

A  Logarithm 

OK 

Cancel 

BEMP  provides  a  function  to  see  the  output  of  the 
frequency  response.  Users  can  select  the  specific  port  with  a 
specific  data  and  plot  formats. 
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BEMP  shows  the  graphical  comparison  between  the 
original  data  and  the  frequency  response  of  macromodels. 

Blue:  Original  data 

Red:  Frequency  response  of  macromodels 
Green:  Error 


Click  Exit  button. 


Appendix 

1.  Touchstone  file  format 

#  [Frequency  Unit]  [Parameter  Type]  [Data  Format]  [R  Reference  Impedance] 

Frequency  Unit:  Hz,  KHz,  MHz,  GHz,  THz 


Parameter  Type:  S,  Y,  Z  (S  for  scattering,  Y  for  admittance,  S  for  impedance) 
Data  Format:  MA,  DB,  RI  (MA  for  magnitude-angle,  DB  for  dB-angle,  RI  for 

real-imaginary) 


Reference  Impedance:  The  reference  resistance  [ohm] 
Example  l:#GHzSMA  50 
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Example  2:  #  GHz  Y  RI  1.0 

Examples  of  Y  parameters  for  four  ports  in  RI  data  format 


#HZ  YRIR  1.0 

Freq  Yrll  Yill 
Yrl2  Yil2 
Yrl3  Yil3 
Yrl4  Yil4 
Yr21  Yi21 
Yr22  Yi22 
Yr23  Yi23 
Yr24  Yi24 
Yr31  Yi31 
& 


2.  Manual  Option 

If  users  want  to  use  a  manual  construction,  BEMP  needs  a  file  containing  the 
construction  options.  The  format  used  in  BEMP  is  : 

Band(A  B)  Order(C  D)  Filter(EFGH  IJKL)  Basis(MNOP)  PoIeReplacement(Q) 

Band(A  B):  A  for  starting  frequency  sample  and  B  for  ending  frequency  sample 
of  the  subband 

Order(C  D):  C  for  the  order  of  numerator  and  D  for  the  order  of  denominator 
Filter(EFGH  IJKL):  EFGH  for  pre-filter  of  poles  of  broadband 

macromodels  and  IJKL  for  post-filter  of  poles  of  subband  macromodels. 

1.  E  and  I  sections:  Users  can  provide  a  character  R  for  removing  the  low-pass 
poles  and  a  dash  character  for  holding  the  low-pass  poles. 

2.  F  and  J  sections:  Users  can  provide  several  characters:  A,  D,  K,  L,  G,  H,  R,  B 

A  for  not  removing  the  poles 
D  for  removing  all  poles 

K  for  removing  the  poles  below  the  starting  point  of  the 
subband 
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L  for  removing  the  poles  below  the  ending  point  of  the  subband 
G  for  removing  the  poles  above  the  starting  point  of  the 
subband  * 

H  for  removing  the  poles  above  the  ending  point  of  the 
subband 

R  for  removing  the  poles  within  the  subband 
B  for  removing  the  poles  below  the  starting  point  of  the 
subband  or  above  the  ending  point  of  the  subband 


3.  G  and  K  sections:  Users  can  provide  a  character  R  for  removing  the  high-pass 
poles  and  a  dash  character  for  holding  the  high-pass  poles. 

4.  H  and  L  sections:  Users  can  provide  a  character  R  for  removing  the  all-pass 
residues  and  a  dash  character  for  holding  the  all-pass  residues. 

Basis(MNOP):  MNOP  for  basis  functions 

1 .  M  section:  Users  can  provide  a  character  L  for  the  low-pass  poles  and 

a  dash  character  for  removing  the  low-pass  poles. 

2.  N  section:  Users  can  provide  a  character  B  for  the  band-pass  poles  and 

a  dash  character  for  removing  the  band-pass  poles. 

3.  O  section:  Users  can  provide  a  character  H  for  the  high-pass  poles  and 

a  dash  character  for  removing  the  high-pass  poles. 

4.  P  section:  Users  can  provide  a  character  L  for  the  all-pass  residues  and 

a  dash  character  for  removing  the  all-pass  residues. 

PoleReplacement(Q):  Q  for  the  number  of  iteration 
Example: 


Band(  1  1 10)Order(20  21)  Filter(RR-  -B-)  Basis(LB-)  PoleReplacement(l)  Band(  1 1 1  200)0rder(20  21)  Filter(- 
R-  -B--)  Basis(-B-)  PoleRepIacement(l)  Band(  201  300)0rder(20  2 1 )  Filter(-R-  -B-)  Basis(-B-) 
PoleReplacement(l)  Band(  301  400)0rder(20  21)  Filter(-R-  -B-)  Basis(-B-)  PoleReplacement(  1 )  Band(  401 
500)0rder(20  21)  Filter(-R-  -B-)  Basis(-B-)  PoleReplacement(l)  Band(  501  600)0rder(20  21)  Filter(-R--  -B-) 
Basis(-B-)  PoleReplacement(l) 
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Band(  601  700)0rder(20  21)  Filter(-R—  -B— )  Basis(-B-)  PoleReplacement(l)  Band(  701  800)0rder(20  21)  Filter(- 
R-  -B-)  Basis(-B--)  PoleReplacement(l)  Band(  801  900)0rder(20  21)  Filter(-R-  -B--)  Basis(-B-) 
PoleReplacement(l)  Band(  901  1000)  Order(20  21)  Filter(-R-A  -G-)  Basis(-B-A)  PoleReplacement(l) 


Example: 


Band(  1  100)0rder(10  1 1)  Filter(RRRR  — )  Basis(LB-)  PoleRep!acement(2)  Band(  101  200)0rder(10  1 1)  Filter(- 
K — G— )  Basis(-B“)  PoleReplacement(2)  Band(  201  300)0rder(10  1 1)  Filter(-K — G— )  Basis(-B— ) 
PoleReplacement(2)  Band(  301  400)0rder(20  1 1)  Filter(-K--  -G— )  Basis(-B~)  PoleReplacement(2) 


Reference: 

BEMP  uses  efficient  constructions  methods;  1)  multiport  passivity  formulae 
[2,3],  2)  band  division  [2,3],  3)  subband  reordering  [2,3],  4)  subband  dilation  [2,3],  5) 
pole  replacement  [2,3],  6)  automatic  construction  sequences,  and  so  on.  BEMP  uses  the 
least  squares  approximation  (version  1.0,  version  2.0,  version  3.0)  [4]  and  vector  fitting 
[5]  (version  3.0)  for  approximating  the  frequency  response. 

[1]  S.  H.  Min  and  M.  Swaminathan,  "Efficient  constriction  of  two-port  passive 
macromodels  for  resonant  networks,"  IEEE  10th  Topical  Meeting  on  EPEP,  Oct.  2001. 

[2]  S.  H.  Min  and  M.  Swaminathan,  "Construction  of  broadband  passive  macromodels 
from  frequency  data  for  simulation  of  distributed  interconnect  networks,"  IEEE  EMC- 
Zurich,  Feb.  2003. 

[3]  S.  H.  Min  and  M.  Swaminathan,  "Construction  of  broadband  passive  macromodels 
from  frequency  data  for  simulation  of  distributed  interconnect  networks,"  accepted  for 
publication  on  IEEE  Trans.  Electromagnetic  Compatibility,  2003. 


[4]  K.  L.  Choi  and  M.  Swaminathan,  "Development  of  model  libraries  for  embedded 
passives  using  network  synthesis,"  IEEE  Trans.  ADSP,  vol.  21,  no.  1,  Feb.  1998.  [5]  B. 
Gustavsen  and  A.  Semiyen,  "Rational  approximation  of  frequency  domain  responses  by 
vector  fitting,"  IEEE  Trans.  Power  Delivery,  vol. 14,  no.  3,  July  1999. 


135 


7.3 


Published  Papers  and  Presentations 


1.  R.  Shringarpure,  T.  Kaplan,  G.  Valley,  P.  Petre,  and  J.  Visher, ’’Modeling  Thermal  Induced  Inter-Symbol 
Interference  of  Feedback  DACs  in  Delta-Sigma  Modulators”,BMAS  2004,  21-22  October  2004. 

2.  S.  D.  Gedney,  A.  Zhu,  W.-H.  Tang,  G.  Liu,  and  P.  Petre,”  A  Fast,  High-Order  Quadrature  Sampled  Pre- 
Corrected  FFT  for  Electromagnetic  Scattering,"  Microwave  and  Optical  Technology  Letters,  vol.  36,  no.  5,  pp. 
343-349,  March  5  2003. 

3.  A.  Zhu  and  S.  D.  Gedney,  "A  Quadrature  Sampled  Pre-Corrected  FFT  Method  for  the  Electromagnetic 
Scattering  from  Inhomogeneous  Objects,"  IEEE  Antennas  and  Wireless  Propagation  Letters,  vol.  2,  no.  1,  pp. 
50-53,  2003. 

4.  S.  Min  and  M.  Swaminathan,  ’’Efficient  Construction  of  Passive  Macro-models  for  Resonant  Networks”, 
Proceedings  of  the  10th  Topical  Meeting  on  Electrical  Performance  of  Electronic  Packaging,  Boston, 
Massachusetts,  Oct.  2001. 

5.  S.  Min  and  M.  Swaminathan,  “Construction  of  broadband  passive  macro-models  from  frequency  data  for 
simulation  of  distributed  interconnect  networks”,  Proceedings  of  the  EMC  Zurich  ’03,  ETH  Zentrum,  Zurich, 
Switzerland,  Feb.  2003. 

6.  S.  Min  and  M.  Swaminathan,  "Construction  of  broadband  passive  macro-models  from  frequency  data  for 
distributed  interconnect  networks",  published  in  the  IEEE  Transactions  on  Electromagnetic  Compatibility,  Nov 
2004. 

7.  R.  Mandrekar  and  M.  Swaminathan,  “Delay  Extraction  from  Frequency  Domain  Data  for  Causal  Macro¬ 
modeling  of  Passive  Networks”,  Submitted  to  the  International  Symposium  on  Circuits  and  Systems  2005. 

8.  R.  Mandrekar,  M.  Swaminathan,  Sungjun  Chun,  “Application  of  wavelets  and  generalized  pencil-of-function 
method  for  the  extraction  of  noise  current  spectrum  and  simulation  of  simultaneous  switching  noise,” 
Proceedings.  17th  International  Conference  on  VLSI  Design,  2004  Pages:995  -  1000 

9.  B.  Mutnury,  M.  Swaminathan  and  J.  Libous,  "Macro-Modeling  of  Non-Linear  I/O  Drivers  using  Spline 
Functions  and  Finite  Time  Difference  Approximation",  Proceedings  of  12th  Topical  meeting  on  EPEP,  Oct 
2003.  Pages:273- 276 

10.  B.  Mutnury  and  M.  Swaminathan,  "Macro-Modeling  Methodology  For  Weakly  Non-Linear  Digital  Drivers", 
Proceedings  of  SPI  workshop  2003,  Italy. 

11. Bhyrav  Mutnury,  Madhavan  Swaminathan  and  James  Libous,  “Black-Box  Modeling  of  Non-linear  I/O 
Drivers,”  The  Applied  Computational  Electromagnetic  Society  (ACES),  Syracuse,  NY,  April  19-23,  2004 

12.  Bhyrav  Mutnury,  Madhavan  Swaminathan  and  Jim  Libous,  “Modeling  of  Power  Supply  Noise  using  Efficient 
Macro-Model  of  Non-Linear  Driver,”  accepted  for  2004  IEEE  International  Symposium  on  EMC,  San  Jose 
(CA),  USA,  August  9-13,  2004,  Pages:  988-993. 

13.  Bhyrav  Mutnury,  Madhavan  Swaminathan  and  Jim  Libous,  “Macro-Modeling  of  Non-Linear  Digital  EO 
Drivers,”  accepted  lox  IEEE  Transactions  on  Advanced  Packaging,  Feb’  2005. 

14.  Bhyrav  Mutnury,  Madhavan  Swaminathan,  Moises  Cases,  Nam  Pham,  Daniel  N.  de  Araujo  and  Erdem 
Matoglu,  "Macro-Modeling  of  Transistor  Level  Receiver  Circuits,"  accepted  at  13*  Tropical  Meeting  on 
Electrical  Performance  of  Electronic  Packaging  (EPEP  2004),  Portland,  Oregon,  October,  2004  (Received  Intel 
Best  Student  Paper  Award). 

15.  R.  Mandrekar,  M.  Swaminathan,  “Wavelet  based  nonparametric  regression  approach  for  de-noising  and 
modeling  of  transient  switching  noise  measurements”.  Electronics  Packaging  Technology,  2003  5th  Conference 
(EPTC  2003),  Dec.  2003  Pages:39  -  44. 


136 


